Sunday, March 20, 2016

Sqoop Vs Flume


    • Apache Sqoop and Apache Flume work with different kinds of data sources.
      Apache Flume works well with streaming data sources that continuously generate data to be loaded into a Hadoop environment, such as log files from multiple servers, whereas
      Apache Sqoop is designed to work with any relational database system that has JDBC connectivity. With a suitable connector, Sqoop can also import data from NoSQL databases such as MongoDB or Cassandra, and it allows direct data transfer to Hive or HDFS. When transferring data to Hive with Apache Sqoop, a Hive table is created whose schema is taken from the source database itself.
    • In Apache Flume, data loading is event-driven, whereas in
      Apache Sqoop the data load is not driven by events; it runs as a batch job when invoked.
    • Apache Flume is a better choice when moving bulk streaming data from sources like JMS or a spooling directory, whereas
      Apache Sqoop is an ideal fit if the data is sitting in databases like Teradata, Oracle, MySQL Server, Postgres or any other JDBC-compatible database.
    • In Apache Flume, data flows to HDFS through one or more channels, whereas in
      Apache Sqoop, HDFS is the direct destination for imported data.
    • Apache Flume agents are designed to fetch streaming data, like tweets from Twitter or log files from a web server, whereas
      Apache Sqoop connectors are designed to work only with structured data sources and fetch data from them.
    • Apache Flume has an agent-based architecture, i.e. the code written in Flume is known as an agent, which is responsible for fetching data, whereas in
      Apache Sqoop the architecture is based on connectors. The connectors in Sqoop know how to connect to the various data sources and fetch data accordingly.
    • Apache Sqoop is mainly used for parallel data transfers and data imports, as it copies data quickly, whereas
      Apache Flume is used for collecting and aggregating data, because it is distributed and reliable and offers highly available backup routes.
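
The two loading models in the list above can be pictured with a small toy sketch in Python. It does not use the Flume or Sqoop APIs: ToyAgent, split_ranges and import_table are hypothetical names made up for illustration, and a plain dict stands in for a relational table. The first part pushes each event through source -> channel -> sink as it arrives (the event-driven Flume style); the second part divides a key range into splits and pulls each split with its own worker, which is roughly how a parallel Sqoop import with several mappers behaves.

```python
from concurrent.futures import ThreadPoolExecutor
from queue import Queue


class ToyAgent:
    """Hypothetical stand-in for a Flume agent; not the real Flume API."""

    def __init__(self):
        self.channel = Queue()  # buffers events between source and sink
        self.sink = []          # stands in for files written to HDFS

    def on_event(self, event):
        # Source side: called once per incoming event (e.g. one log line).
        self.channel.put(event)

    def drain(self):
        # Sink side: move everything buffered in the channel to the sink.
        while not self.channel.empty():
            self.sink.append(self.channel.get())


def split_ranges(min_id, max_id, num_mappers):
    """Divide [min_id, max_id] into contiguous, non-overlapping ranges."""
    total = max_id - min_id + 1
    size, extra = divmod(total, num_mappers)
    ranges, lo = [], min_id
    for i in range(num_mappers):
        # The first `extra` ranges each take one additional key.
        hi = lo + size + (1 if i < extra else 0) - 1
        if hi >= lo:  # skip empty ranges when there are more mappers than keys
            ranges.append((lo, hi))
        lo = hi + 1
    return ranges


def import_table(rows, num_mappers):
    """Batch import: rows maps an integer key to a record (assumed layout)."""
    ranges = split_ranges(min(rows), max(rows), num_mappers)

    def fetch(key_range):
        lo, hi = key_range
        return [rows[k] for k in range(lo, hi + 1) if k in rows]

    # One worker per split; map() returns the parts in split order.
    with ThreadPoolExecutor(max_workers=num_mappers) as pool:
        return list(pool.map(fetch, ranges))


if __name__ == "__main__":
    agent = ToyAgent()
    for line in ["GET /index.html", "GET /about.html"]:
        agent.on_event(line)  # load is triggered by each event
    agent.drain()
    print(agent.sink)

    table = {i: f"row{i}" for i in range(1, 11)}
    print(import_table(table, num_mappers=4))  # load is triggered by the job
```

In the toy agent, nothing moves until an event arrives, while import_table does all of its work in one invocation and finishes when every split has been copied.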
