By providing distributed data storage and a parallel computing framework, Hadoop has evolved from an abstraction of clustered computing into an operating system for big data. This book introduces Hadoop cluster computing and analysis from a data scientist's perspective. It aims to give data scientists an in-depth understanding of specific subject areas through a readable, intuitive overview of cluster computing and analysis. The book is divided into two parts: the first introduces distributed computing at a very high level and discusses how to run computations on a cluster; the second focuses on the tools and techniques that data scientists should know to power various analytics and large-scale data management.