UMBC Computer Science alumni Donald Miner (BS '06, PhD '10) (left) and Adam Shook (BS '09, MS expected '13) (right) have written a book on MapReduce, the popular paradigm that has revolutionized the way collections of computers are used to process large amounts of data in parallel. Their book, MapReduce Design Patterns: Building Effective Algorithms and Analytics for Hadoop and Other Systems, was published by O'Reilly Media in December.
“Adam and I were teaching Hadoop classes and we saw a gap: students would pick up on how Hadoop worked mechanically, but struggled to understand how to solve problems with it,” explains Donald, who now works as a Solutions Architect at EMC Greenplum. “This book is intended for people who have a basic understanding of Hadoop, but want to start solving their problems effectively.”
Adam and Donald met in fall 2008 during an Artificial Intelligence class at UMBC. Donald, a Ph.D. student working with Dr. Marie desJardins on machine learning and multiagent systems research, was teaching the class. Adam, an undergraduate Computer Science student, was taking the class. Later, the pair ended up working together at ClearEdge IT Solutions.
“We worked well together and our skills and interests complemented each other well, so when I had the opportunity to write this book I knew it would be a much better book doing it with him than doing it alone,” says Donald.
Now Adam works with big data technologies like Hadoop, Accumulo, Pig, and ZooKeeper as a Software Engineer at ClearEdge IT Solutions. He is working towards his Master’s in Computer Science at UMBC under Dr. Tim Finin. His research deals with developing an efficient in-memory distributed database for Semantic Web applications.
“I don’t know how I do it,” says Adam about working full time, being in graduate school, and writing a book. “Caffeine helps.” He plans on finishing up his degree in 2013.
In the meantime, Adam and Donald have started working on Mapreducepatterns.com. Still in its infancy, the website is meant to be a Wikipedia-style site where the Hadoop community can rally around a well-defined set of standard MapReduce design patterns.