Big Data Analytics with Spark: A Practitioner's Guide to Using Spark for Large Scale Data Analysis, 1st Edition

by Mohammed Guller (Author)

18 ratings


Big Data Analytics with Spark is a step-by-step guide for learning Spark, an open-source, fast, general-purpose cluster computing framework for large-scale data analysis. You will learn how to use Spark for different types of big data analytics projects, including batch, interactive, graph, and stream data analysis as well as machine learning. In addition, this book will help you become a much sought-after Spark expert.

Spark is one of the hottest Big Data technologies. The amount of data generated today by devices, applications, and users is exploding. Therefore, there is a critical need for tools that can analyze large-scale data and unlock value from it. Spark is a powerful technology that meets that need. You can, for example, use Spark to perform low-latency computations through efficient caching and iterative algorithms; leverage the features of its shell for easy, interactive data analysis; employ its fast batch processing and low-latency features to process your real-time data streams; and so on. As a result, adoption of Spark is growing rapidly, and it is replacing Hadoop MapReduce as the technology of choice for big data analytics.

This book provides an introduction to Spark and related big-data technologies. It covers Spark core and its add-on libraries, including Spark SQL, Spark Streaming, GraphX, and MLlib. Big Data Analytics with Spark is therefore written for busy professionals who prefer learning a new technology from a consolidated source instead of spending countless hours on the Internet trying to pick bits and pieces from different sources.

The book also provides a chapter on Scala, the hottest functional programming language and the language in which Spark is written. You’ll learn the basics of functional programming in Scala, so that you can write Spark applications in it.

What's more, Big Data Analytics with Spark provides an introduction to other big data technologies that are commonly used along with Spark, like Hive, Avro, Kafka, and so on. So the book is self-sufficient; all the technologies that you need to know to use Spark are covered. The only thing that you are expected to know is programming in any language.

There is a critical shortage of people with big data expertise, so companies are willing to pay top dollar for people with skills in areas like Spark and Scala. So reading this book and absorbing its principles will provide a boost—possibly a big boost—to your career.

Editorial Reviews

Review

“Programmers seeking to learn the Spark framework and its libraries will benefit greatly from this book. … The book is well written, with a good balance between presenting simple computer science concepts, such as functional programming, and introducing Scala, the Spark core language. … the book provides substantial information on cluster-based data analysis using Spark, a prominent framework used by data scientists. It is very nicely written, with interesting contemporary considerations and several source code examples.” (Andre Maximo, Computing Reviews, computingreviews.com, June, 2016)

About the Author

Mohammed Guller is the principal architect at Glassbeam, where he leads the development of advanced and predictive analytics products. He is a big data and Spark expert. He is frequently invited to speak at big data–related conferences. He is passionate about building new products, big data analytics, and machine learning.

Over the last 20 years, Mohammed has successfully led the development of several innovative technology products from concept to release. Prior to joining Glassbeam, he was the founder of TrustRecs.com, which he started after working at IBM for five years. Before IBM, he worked in a number of hi-tech start-ups, leading new product development.

Mohammed has a master's of business administration from the University of California, Berkeley, and a master's of computer applications from RCC, Gujarat University, India.

Product details

ASIN: 1484209656
Publisher: Apress; 1st edition (December 25, 2015)
Language: English
Paperback: 300 pages
ISBN-10: 1484209656
ISBN-13: 978-1484209653
Item Weight: 1.25 pounds
Dimensions: 7.01 x 0.69 x 10 inches

Best Sellers Rank: #598,588 in Books
- #110 in Database Storage & Design
- #258 in Statistics (Books)
- #301 in Data Processing


Customer reviews

18 global ratings

5 star: 25%
4 star: 39%
3 star: 36%
2 star: 0%
1 star: 0%


Top reviews from the United States


TravelOr
It's a very good book for anyone to get started
Reviewed in the United States on January 30, 2016
Verified Purchase

Mr. Guller has done a fabulous job by writing "Big Data Analytics with Spark". It's a very good introductory book for anyone to get started. It walks through numerous sets of examples, including a primer on big data and Scala. Then it moves on to Spark. A must-have for all beginners.

Abhishek Srivastava
Good (but very basic) book. More breadth than depth
Reviewed in the United States on May 14, 2016
Verified Purchase
I liked the book. It is nice and simple and good for beginners.

One issue I found is that this book covers the entire Spark ecosystem (Spark core, Spark SQL, Spark Streaming, MLlib) in a very brief way. It doesn't go deep into any of the topics. This makes it very similar to the Spark programming guide on the web, which adopts a similar pattern.

It would have been nice if the author had gone deep into Spark core. After reading this book, I found it didn't teach me anything more than the Spark programming guide.

6 people found this helpful

Ian Stirk
If you want to learn Spark, buy this book. Highly recommended
Reviewed in the United States on February 16, 2016
Hi,

I have written a detailed chapter-by-chapter review of this book on www DOT i-programmer DOT info; the first and last parts of that review are given here. For my review of all chapters, search i-programmer DOT info for STIRK together with the book's title.

This book aims to provide a “...concise and easy-to-understand tutorial for big data and Spark”. How does it fare?

Spark is increasingly the tool of choice for big data processing, being much faster than Hadoop’s MapReduce. After putting Spark into a big data context, the book aims to cover Spark’s core library, together with its more specialized libraries for Streaming, Machine Learning, SQL, and Graphing.

The book is aimed at developers who are new to Spark; some general programming background is required, but little else.

Chapter 1 Big Data Technology Landscape

This chapter opens with a discussion about the current big data age, with data as the lifeblood of organizations, and growing exponentially. The standard 3Vs definition of big data is explored (velocity, variety, volume). Traditional relational database management systems (RDBMS) are unable to process these large volumes in a timely manner – this is where the scalability of big data systems comes into its own.

Next, the chapter discusses some technologies that are either used with Spark or compete with it. The first technology is Hadoop, which is fault tolerant and scalable and runs on commodity hardware. The three major components of Hadoop are discussed: YARN (Yet Another Resource Negotiator), MapReduce (distributed processing model), and HDFS (Hadoop Distributed File System). Spark is increasingly being used in place of MapReduce owing to its faster speed. The section briefly discusses Hive, a data warehouse with a SQL-like interface; Spark SQL is expected to supersede Hive on many systems.
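
The MapReduce processing model mentioned above can be illustrated with a minimal single-process sketch in plain Python. This is not Hadoop's or Spark's API; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are assumptions made purely for illustration.

```python
from collections import defaultdict

def map_phase(line):
    # Emit a (word, 1) pair for every word in one input line.
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    # Group all emitted values by key, as a framework would do between phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Combine all values for one key into a single result.
    return key, sum(values)

def word_count(lines):
    # Run the three phases in sequence on an in-memory list of lines.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

# Hypothetical input, used only to exercise the sketch.
counts = word_count(["spark and hadoop", "spark on yarn"])
print(counts)
```

In a real cluster the map and reduce phases would run in parallel on many machines and the shuffle would move data over the network; the sketch only shows the shape of the computation.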

The chapter continues with a look at some common binary formats for serializing (storing on disk) big data, and their pros and cons. Specifically, Avro, Thrift, Protocol Buffers, and SequenceFile are examined. Next, some column storage formats, which have performance advantages when the client requires a subset of columns, are briefly discussed, namely RCFile, ORC, and Parquet.
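
Why a column layout helps when only a subset of columns is needed can be shown with a small in-memory sketch. It does not use RCFile, ORC, or Parquet; the records, column names, and helper functions below are hypothetical, and "values read" is only a rough stand-in for I/O cost.

```python
# Hypothetical three-record data set, used only for illustration.
rows = [
    {"id": 1, "name": "a", "score": 10},
    {"id": 2, "name": "b", "score": 20},
    {"id": 3, "name": "c", "score": 30},
]

def to_columns(records):
    # Pivot row-oriented records into one list per column.
    return {key: [record[key] for record in records] for key in records[0]}

def values_read_row_layout(records):
    # A row layout has to read every field of every record,
    # even when the query needs a single column.
    return sum(len(record) for record in records)

def values_read_column_layout(columns, wanted):
    # A column layout reads only the requested columns.
    return sum(len(columns[name]) for name in wanted)

columns = to_columns(rows)
total_score = sum(columns["score"])
print(total_score, values_read_row_layout(rows), values_read_column_layout(columns, ["score"]))
```

Summing one column touches 3 values in the column layout versus 9 in the row layout here; the gap widens as the number of columns grows.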

Then a brief overview of messaging systems is provided, together with the advantages of having a layer of abstraction between producers and consumers. Specifically, Kafka and ZeroMQ are discussed with the aid of useful supporting diagrams.
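
The layer of abstraction between producers and consumers can be sketched with Python's standard queue module standing in for a messaging system. This is not the Kafka or ZeroMQ API; the function names and the sample messages are assumptions for illustration only.

```python
import queue
import threading

def producer(channel, messages):
    # The producer only knows about the channel, not about any consumer.
    for message in messages:
        channel.put(message)
    channel.put(None)  # Sentinel: no more messages.

def consumer(channel, received):
    # The consumer only knows about the channel, not about any producer.
    while True:
        message = channel.get()
        if message is None:
            break
        received.append(message.upper())

def run_pipeline(messages):
    # Wire one producer and one consumer together through a shared queue.
    channel = queue.Queue()
    received = []
    worker = threading.Thread(target=consumer, args=(channel, received))
    worker.start()
    producer(channel, messages)
    worker.join()
    return received

# Hypothetical event names, used only to exercise the sketch.
print(run_pipeline(["login", "click", "logout"]))
```

Because both sides depend only on the channel, either one can be replaced or scaled without changing the other, which is the advantage the chapter attributes to messaging systems.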

NoSQL is then examined. The various types of NoSQL databases have different aims to the traditional RDBMS, typically trading Atomicity, Consistency, Isolation, Durability (ACID) for scalability and flexibility. The specific NoSQL databases briefly discussed are Cassandra and HBase. I sometimes wonder if it is meaningful to group NoSQL databases together. Is it meaningful to divide sports into Football and NoFootball? Are all the NoFootball sports meaningful as a group?

The chapter ends with a look at some distributed SQL query engines; these do not use MapReduce batch jobs and are thus more oriented to interactive querying. The engines briefly examined are Impala, Presto, and Apache Drill.

This chapter provides an excellent overview of big data technology. It should be noted there are many more technologies than described, but the examples given are sufficient to explain the topic areas. This is possibly the best backgrounder to big data I’ve read.

The discussions are very well written, concise and clear, with helpful diagrams, and no wasted words. There’s a good flow between the topics, and useful links between chapters. There are website links for further information. These traits apply to all the chapters in the book.
.
.
.
Conclusion

This book aims to provide a “...concise and easy-to-understand tutorial for big data and Spark”, and clearly succeeds. The book is exceptionally well written. Helpful explanations, diagrams, practical step-by-step walkthroughs, annotated code, inter-chapter links, and website links abound throughout.

The book is aimed at developers who are new to Spark, and explains concepts from the beginning. If you work through the book, you should become competent in the use of Spark. There is much more to learn, of course, but this book gives a solid foundation in both core Spark and its major specialized libraries: Streaming, Machine Learning, SQL, and Graphing.

The book is based on workshops given by the author, and clearly the feedback from these has been useful in creating this book, since it seems to have answered all the questions I had.

This book provides everything you need to know to get started with Spark, explained in an easy-to-follow manner. If you want to learn Spark, buy this book. Highly recommended.

7 people found this helpful

SAS
Good overview of Spark (using Scala)
Reviewed in the United States on July 26, 2016
After picking up the basics of Scala (from books like Scala for the Impatient, the Scala Cookbook, and blogs), I tried reading up on Spark. This book was very useful for me because it's in Scala, and Scala only. Unlike some other books that show samples in Java/Python/Scala, having only Scala reduces clutter and bulk. But yes, unfortunately, it is not in Python.
The writing is pretty good, and the editing is also good (I am mentioning this because a bunch of Scala/Spark books out there have terrible, sometimes incomprehensible, language).
The examples are fairly complete. The sections on SQL and Spark Streaming are less verbose and more example-filled. After each example, there is a description of the code.
Of all the books so far, this had the most pleasant introduction to Machine Learning. It's purely from a software development perspective, with sample use cases showcasing an appropriate MLlib library. The Spark ML section is a bit rushed, but has enough samples to get started via blogs and the Apache site.


Top reviews from other countries

Frank R.
A must-have for every Spark practitioner
Reviewed in the United Kingdom on January 23, 2016
Verified Purchase

This book is both a reference and a tutorial. It covers the full range of Spark's features and has numerous well-thought-out examples that you can select, copy-paste, and use in your applications. It also has great images and plots, and especially in the ML and Graph sections they help you quickly understand what a method is all about. It's a must-have for every Spark practitioner.

Amazon Customer
meh...Try googling first
Reviewed in the United Kingdom on December 27, 2016
Verified Purchase
Not really worth it in my opinion. You would be able to find similar material in blogs or the Spark documentation.

Centelles
... have just finished the book and I have really enjoyed it. It gives you an overall idea of ...
Reviewed in the United Kingdom on March 21, 2016
Verified Purchase
I have just finished the book and I have really enjoyed it. It gives you an overall idea of how Spark works, which can be a bit overwhelming and fuzzy at first. It is written in a very concise and straightforward way. At the end, you see how all these pieces in the puzzle fit together! Great book.


Top reviews from the United States

4.0 out of 5 stars It's a very good book for anyone to get started

3.0 out of 5 stars Good (but very basic) book. More breadth than depth

5.0 out of 5 stars If you want to learn Spark, buy this book. Highly recommended

4.0 out of 5 stars Good overview of Spark (using Scala)

Top reviews from other countries

5.0 out of 5 stars A must-have for every Spark practitioner

3.0 out of 5 stars meh...Try googling first

5.0 out of 5 stars ... have just finished the book and I have really enjoyed it. It gives you an overall idea of ...