[Intro to Hadoop and MapReduce] Lesson 2 Problem set

1. Quiz: Dimensions of Big Data

Which of the following are part of the 3 dimensions of Big Data?

  • Volume
  • Cost
  • Importance
  • Velocity
  • Source
  • Variety
  • Security
  • Virality

2. Quiz: Volume

The volume of Big Data refers to:

  • Importance of data
  • Size of data
  • Speed of data generation
  • The different data sources

3. Quiz: Hadoop Ecosystem

Check all that are true:

  • Hadoop provides an efficient way of storing data via HDFS
  • Hadoop has a visualization framework called ‘Giraffe’
  • You can analyze large datasets using a high-level language called ‘Pig’
  • ‘Hive’ offers a SQL-like language on top of MapReduce
  • The tools in Hadoop’s ecosystem are all proprietary, commercial tools

4. Quiz: Variety

Hadoop can store data in which of these formats?

  • XML
  • Text
  • JSON
  • Any non-binary format
  • Any format

5. Quiz: History of Hadoop

Hadoop was:

  • Written by Google and released as open source
  • Originally part of an open source project called Nutch
  • Written by Nutch, Inc. as a proprietary product but then released as open source
  • Developed by the US government and released as open source