{"id":12059,"date":"2014-12-23T22:00:00","date_gmt":"2014-12-23T22:00:00","guid":{"rendered":"http:\/\/cloudcomputing.sys-con.com\/node\/3268486"},"modified":"2014-12-23T22:00:00","modified_gmt":"2014-12-23T22:00:00","slug":"data-lake-and-data-refinery-gartner-controversy-cloudexpo-bigdata","status":"publish","type":"post","link":"https:\/\/icloud.pe\/blog\/data-lake-and-data-refinery-gartner-controversy-cloudexpo-bigdata\/","title":{"rendered":"Data Lake and Data Refinery \u2013 Gartner Controversy! | @CloudExpo [#BigData]"},"content":{"rendered":"<p>The term \u2018data lake\u2019 was coined by James Dixon of Pentaho Corp., who described it this way: If you think of a datamart as a store of bottled water \u2013 cleansed and packaged and structured for easy consumption \u2013 the data lake is a large body of water in a more natural state. The contents of the data lake stream in from a source to fill the lake, and various users of the lake can come to examine, dive in, or take samples. Think of a data lake as an unstructured data warehouse, a place where you pull in all of your different sources into one large \u201cpool\u201d of data. In contrast to a data mart, a data lake won\u2019t \u201cwash\u201d the data or try to structure it or limit the use cases. Sure, you should have some use cases in mind, but the architecture of a data lake is simple: a Hadoop Distributed File System (HDFS) with lots of directories and files on it.<\/p>\n<p><a href=\"http:\/\/cloudcomputing.sys-con.com\/node\/3268486\" >read more<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>The term &lsquo;data lake&rsquo; was coined by James Dixon of Pentaho Corp., who described it this way: If you think of a datamart as a store of bottled water &ndash; cleansed and packaged and structured for easy consumption &ndash; the data lake is a large body of water in a more natural state. 
The contents of the data lake stream in from a source to fill the lake, and various users of the lake can come to examine, dive in, or take samples. Think of a data lake as an unstructured data warehouse, a place where you pull in all of your different sources into one large &ldquo;pool&rdquo; of data. In contrast to a data mart, a data lake won&rsquo;t &ldquo;wash&rdquo; the data or try to structure it or limit the use cases. Sure, you should have some use cases in mind, but the architecture of a data lake is simple: a Hadoop Distributed File System (HDFS) with lots of directories and files on it.<\/p>\n<p><a href=\"http:\/\/cloudcomputing.sys-con.com\/node\/3268486\" target=\"_blank\">read more<\/a><\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[],"tags":[],"class_list":["post-12059","post","type-post","status-publish","format-standard","hentry"],"_links":{"self":[{"href":"https:\/\/icloud.pe\/blog\/wp-json\/wp\/v2\/posts\/12059","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/icloud.pe\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/icloud.pe\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/icloud.pe\/blog\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/icloud.pe\/blog\/wp-json\/wp\/v2\/comments?post=12059"}],"version-history":[{"count":1,"href":"https:\/\/icloud.pe\/blog\/wp-json\/wp\/v2\/posts\/12059\/revisions"}],"predecessor-version":[{"id":13609,"href":"https:\/\/icloud.pe\/blog\/wp-json\/wp\/v2\/posts\/12059\/revisions\/13609"}],"wp:attachment":[{"href":"https:\/\/icloud.pe\/blog\/wp-json\/wp\/v2\/media?parent=12059"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/icloud.pe\/blog\/wp-json\/wp\/v2\/categories?post=12059"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/icloud.pe\/blog\/wp-json\/
wp\/v2\/tags?post=12059"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}