We live in the era of Big Data, in which continuously increasing computational power produces an overwhelming flow of data, calling for a paradigm shift in computing architectures and large-scale data processing mechanisms. In general, the growing demand for large-scale data mining and data analysis applications has spurred the development of novel solutions from both industry (e.g., web-data analysis, click-stream analysis, network-monitoring log analysis) and the sciences (e.g., analysis of data produced by massive-scale simulations, sensor deployments, high-throughput lab equipment). Although parallel database systems (e.g., Teradata, Vertica, Greenplum, Netezza) serve some of these data analysis applications, they are expensive, difficult to administer, and lack fault tolerance for long-running queries. The MapReduce framework was introduced by Google for programming clusters of commodity computers to perform large-scale data processing, and Apache Hadoop provides an open-source implementation of it. The framework is designed so that a MapReduce cluster can scale to thousands of nodes in a fault-tolerant manner. These features have enabled Apache Hadoop to achieve worldwide adoption in different businesses and application domains.

Objective of the Book

The objective of this book is to provide a central source of reference on data management techniques for large-scale data processing and their technological applications. The book will contain openly solicited and invited chapters written by leading researchers, academics, and practitioners in the field. All chapters will be reviewed by three independent reviewers. The book will cover the state of the art as well as the latest research discoveries and applications, making it a valuable reference for a broad audience.

Target Audience

This book will be an important reference for researchers and academics working in the interdisciplinary domains of databases, data mining, and web-scale data processing, and in related areas such as data warehousing, social networks, bioinformatics, the semantic web, and so forth. It will also be a potential resource for senior undergraduate and postgraduate students.


Recommended topics include, but are not limited to, the following:

  • NoSQL data stores and DB scalability
  • Cloud data management architectures
  • Scalable and parallel relational database management systems
  • MapReduce and other processing paradigms for analytics
  • Big Data placement, scheduling, and optimization
  • Big Data analytics and visualization
  • Distributed file systems for Big Data
  • Programming models for Big Data processing
  • Parallel query processing and optimization
  • Data management and analytics for vast amounts of unstructured data (XML, RDF, Graphs)
  • Benchmarking, tuning, and testing
  • Data science and analytics technologies
  • Large scale scientific data management
  • Clustering, classification, and link analysis of Big Data
  • Scalable data mining and machine learning techniques
  • Scalable distributed stream processing systems
  • Energy efficiency and energy-efficient designs for analytics
  • Debugging and performance analysis tools for analytics and data-intensive computing
  • Industrial experience and use cases

Submission Procedure

Researchers and practitioners are invited to submit, on or before October 30, 2012, a 2-3 page chapter proposal that includes the chapter title, author names, contact information with a short bio, keywords, and an abstract that clearly explains the mission, concerns, and outline of the proposed chapter. Authors will be notified by November 30, 2012 about the status of their proposals, and authors of accepted proposals will be sent chapter guidelines. Full chapters are expected to be submitted by February 28, 2013. A book chapter should not exceed 30 pages. All submitted chapters will be reviewed on a double-blind basis. Please submit your proposal and manuscript to:


The book is scheduled to be published by CRC Press. The history of CRC Press, Taylor & Francis Group, has paralleled the history of science. For 95 years, the publisher has recorded the accomplishments of pioneering thinkers and researchers across all of science and technology, while also providing the resources and tools needed by those seeking to continue that progress. Recognized as a pioneer in the scientific publishing industry, CRC Press maintains a reputation that is as extraordinary in its depth as it is global in its reach. CRC Press reaches around the globe with authoritative coverage of traditional and emerging fields, publishing the pioneering achievements of science and technology to provide professionals and students with the resources they need to make further advances. For additional information regarding the publisher, please visit the CRC Press website. This publication is anticipated to be released by the end of 2013.

Important Dates

October 30, 2012: Proposal Submission Deadline

November 15, 2012: Notification of Acceptance

February 15, 2013: Full Chapter Submission

March 30, 2013: Review Results Returned

May 30, 2013: Final Chapter Submission

Advisory Board

Ashraf Aboulnaga, University of Waterloo, Canada

Rajkumar Buyya, University of Melbourne, Australia

Amr El Abbadi, University of California, Santa Barbara, USA

Ian Gorton, Pacific Northwest National Laboratory, USA

Haixun Wang, Microsoft Research

Albert Zomaya, University of Sydney, Australia


  • Sherif Sakr

    School of Computer Science and Engineering

    University of New South Wales, Australia


  • Mohamed Medhat Gaber

    School of Computing

    University of Portsmouth