Category Archives: apache-crunch

Does Apache Crunch come with the Hadoop MapReduce API?

The Apache Crunch download from the project website is a source release, and it does not include the MapReduce classes it is built on. Two questions:

1- How is this possible? Apache Crunch is an abstraction on top of MapReduce, so why isn't it packaged with the MapReduce classes?

2- What do I need to do to develop with Apache Crunch? Do I need to download Crunch and MapReduce separately? If so, how do I know which MapReduce version matches my Crunch version?

Crunch MRPipeline failing on MapSideJoinStrategy

Hi, I am working on a Crunch job that uses a map-side join strategy. The job runs successfully with MemPipeline, but it fails when I run it with MRPipeline, and I don't know why this is happening.

I am trying to construct a MapSideJoinStrategy on two PTables, passing the small table on the left and the bigger one on the right. The join works with MemPipeline, but with MRPipeline it fails to create the join on these tables. Any input on this is appreciated.
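
For context, here is a minimal plain-Java sketch of what a map-side join does conceptually: the small table is indexed in memory and the large table is streamed past it, so no shuffle is needed. It does not use the Crunch API. The class `MapSideJoinSketch` and the method `join` are hypothetical names made up for illustration, and rows are assumed to be simple `{key, value}` string pairs. It makes no claim about which side Crunch itself loads into memory or about why the MRPipeline run fails.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of Apache Crunch.
public class MapSideJoinSketch {

    // Inner-joins two keyed datasets by holding the small one in memory.
    // Each input row is {key, value}; each output row is "key:smallValue,largeValue".
    static List<String> join(List<String[]> small, List<String[]> large) {
        // Build an in-memory index of the small side: key -> all its values.
        Map<String, List<String>> index = new HashMap<>();
        for (String[] row : small) {
            index.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }

        // Stream the large side and look each key up in the index.
        List<String> out = new ArrayList<>();
        for (String[] row : large) {
            List<String> matches = index.get(row[0]);
            if (matches == null) {
                continue; // inner join: drop rows with no match on the small side
            }
            for (String smallValue : matches) {
                out.add(row[0] + ":" + smallValue + "," + row[1]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> small = List.of(new String[]{"a", "1"}, new String[]{"b", "2"});
        List<String[]> large = List.of(
                new String[]{"a", "x"}, new String[]{"c", "y"}, new String[]{"a", "z"});
        System.out.println(join(small, large));
    }
}
```

The sketch only works when the indexed side fits in memory, which is why it matters which of the two tables a join strategy treats as the small one.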