Assignment 5: MapReduce

Get familiar with Google's MapReduce [1] and its programming model.

Implement a sample application with an open source framework of your choice, e.g. Hadoop [2] or Twister [3]. If you cannot come up with an idea of your own, you may base your application on the sample applications from the tutorials.

Take a look at Amazon Web Services, such as the Amazon Elastic Compute Cloud (EC2) or Amazon Elastic MapReduce. Browse the web for news and articles about applications implemented with MapReduce (and Amazon). For example, the New York Times created several million PDF documents from several TB of scanned documents.

Test your application locally and on an (at least pseudo-)distributed system, and document its performance.

Please submit your implementation as a zip file without binaries, but including a readme (what it is, how it works, how to compile and run it, ...) and your performance observations.

Explain MapReduce (and be prepared to answer some questions about the paper!) and present and run your implementation in the PS.

Deadline: Monday, 30.05., by e-mail with subject: [VS]
Presentation/Discussion: Wednesday, 01.06., in the PS

[1] http://labs.google.com/papers/mapreduce-osdi04.pdf
[2] http://hadoop.apache.org/mapreduce/
[3] http://www.iterativemapreduce.org/