CSU 670 Spring 2009 Out: January 13, 2009 Due: January 20, 2009 PART 1: ========================================For this subproject we fill in the unspecified concepts from subproject 1. What was left unspecified was the concept of a predicate T of a derivative, the concept of raw material satisfying predicate T and the concept of finishing the raw material with a certain quality.
Our raw material are Boolean formulas in conjunctive normal form (CNF). A CNF consists of a list of clauses. The same clause may appear several times: so our CNFs are a bag of clauses. Each clause consists of a list of literals. A literal is either a positive or a negative variable.
raw material = CNF = bag of clauses
The concept of finishing the raw material amounts to finding an assignment that satisfies many clauses (ideally it maximizes the number of satisfied clauses over all possible assignments). An assignment assigns true or false to each variable of a CNF. An assignment is represented by a clause: a positive literal means the variable is set to true, a negative literal means the variable is set to false. A clause is satisfied by an assignment if at least one literal in the clause is satisfied. A positive literal is satisfied if its variable is assigned true and a negative literal is satisfied if its variable is assigned false. The quality of an assignment is the fraction of satisfied clauses among the total number of clauses.
finished product = CNF plus assignment
We need to define the concept of a predicate for a CNF. A predicate for a CNF is a set of clause types that may be used in the CNF. The type of a clause is a pair: the first element is the length of the clause (the number of literals) and the second element the number of positive literals. The clause (a !b !c) is of type (3,1) and the clause (!a !b !c !d) is of type (4,0). For example the CNF ((a) (b) (c) (!a !b) (!b !c)) satisfies the predicate ((1,1) (2,0)) containing two clause types.
Note that when you buy a derivative you buy it based on its predicate and its price. So you will need to figure out whether a price for a given predicate is fair. The uncertainty you have is different than with the weather, or exchange rates.
uncertainty ----------- weather: how nice will the weather be? knowledge to reduce uncertainty: model of weather behavior. interest rates: how high will the rate be? knowledge to reduce uncertainty: model of economy behavior. Specker Derivative Game: how much can I satisfy in the CNF I get? knowledge to reduce uncertainty: model of CNF
Complete the object-oriented design from project 1 with the new information.
What to turn in: Your extended object-oriented design as a cd file.
PART 2: ===================================================================== MapReduce Algorithms Read the Google paper about how they use MapReduce: http://googlesystem.blogspot.com/2008/01/google-reveals-more-mapreduce-stats.html This is an exercise about finding an efficient algorithm for a problem by finding a way of expressing the algorithm in a restricted computational model. Google is successfully using the MapReduce programming model. Here are two paragraphs from the Google MapReduce paper (2004): MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key. Many real world tasks are expressible in this model, as shown in the paper. Programs written in this functional style are automatically parallelized and executed on a large cluster of commodity machines. The run-time system takes care of the details of partitioning the input data, scheduling the program's execution across a set of machines, handling machine failures, and managing the required inter-machine communication. This allows programmers without any experience with parallel and distributed systems to easily utilize the resources of a large distributed system. =================== We apply the MapReduce programming model to implementing SDG but without executing the algorithms on a cluster consisting of hundreds of machines. Instead we express MapReduce in DemeterF and run the program on a single machine with possibly multiple cores. Here is a simple DemeterF program inspired by Google's MapReduce paper: it computes word occurences in a list of ducuments. The idea is to create an entry or pair (w,1) for each word w and to merge those entries together. The class dictionary and behavior file, etc. are here: /home/lieber/.www/courses/csu670/sp09/project/project2/mapReduceDemeterF This program uses the TUCombiner function object to express the map and the reduction conveniently. The map is expressed here: (It is a bit confusing that the map returns a Map) public Mapcombine(){ return empty; } Map combine(Word n){ return empty.put(n,1); } The first line creates an empty Map and the second line creates a map containing one entry (or pair): (n,1) for word n. The reduction is expressed by the fold function which folds two maps together. For example the Map-object: ((n,1) (m,3)) and the Map-object ((n,6) (o,7)) are folded together: ((n,7) (m,3) (o,7)). http://www.ccs.neu.edu/research/demeter/DemeterF/doc/edu/neu/ccs/demeterf/TUCombiner.html tells you more about TUCombiner. Please note that the Java class Map is implemented in the DemeterF library. It is a functional Map but has many of the same features as the Java library Map class. Your task is to implement a program to collect all derivatives that are currently for sale in the store into a list using the MapReduce programming model. For help inspect the files: general.cd classic.cd ListTUCombiner.java and find the UNKNOWN below: /** Returns a list of derivatives that are available for sale * @param store * @return {@link List} */ public static List findDerivativesForSale(List > store) { return TUCombiner.traverse(store, new DerivativesForSale()); } class DerivativesForSale extends ListTUCombiner { UNKNOWN } Part 3: ============================================================================== In your team of two students play a few rounds of SDG/secret as described in http://www.ccs.neu.edu/home/lieber/evergreen/specker/secret-hiding.html Plug in MAX-SAT as the maximization problem (given a weighted CNF, maximize the fraction of satsfied constraints). Remember the pair notation: (length, number of positive literals). Consider: d1 = (SDG/secret MAX-SAT (2,0) (1,1), p1, Matt) d2 = (SDG/secret MAX-SAT (1,0) (1,1), p2, Matt) How would you choose p1 and p2? Turn in the history of the game you played.