Our Successful Hadoop Implementations
September 23, 2018
One of our clients is a financial services company with operations worldwide. We helped them deliver an end-to-end data warehousing solution, including data quality governance, data visualization, and data security.
We handled the infrastructure and operations of the Big Data Analytics platform, which is hosted on a secure multi-tenant Hadoop cluster. Built on the Cloudera Distribution of Hadoop, the platform supports use cases such as structured and semi-structured data archival, unstructured document archival, data visualization, modeling, and data science on the data in the data lake.
Our team of Hadoop engineers designed, administered, and supported the following stack on the Big Data Analytics platform:
· Hive as the data warehouse
· Kudu as the real-time columnar data store
· Sqoop for batch data ingestion
· Kafka and Flume for real-time data ingestion
· StreamSets for building dataflows
· Impala for interactive and batch SQL querying
· R, Python, and Spark for analytics
· Oozie for scheduling workflows
· Solr for content and metadata indexing of documents
· Cloudera Data Science Workbench as the self-service data science notebook
· Cloudera Navigator for data governance
Some of our recent successful deliveries are:
· Migration from Hive on MapReduce to Hive on Spark
· Integration of data visualization tools such as Tableau and TIBCO Spotfire with Hadoop using Impala
· Installation and configuration of StreamSets, Cloudera Data Science Workbench (CDSW), and Kudu
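As a rough illustration of the Hive on MapReduce to Hive on Spark migration listed above: Hive selects its execution engine through the `hive.execution.engine` property, which accepts `mr`, `tez`, or `spark`. The sketch below is a minimal example, assuming the switch is made per session. The helper name `engine_switch_statements` is hypothetical and not taken from the project, and a real migration also involves cluster-side configuration and query testing that are not shown here.

```python
from typing import List

# Values accepted by Hive's hive.execution.engine property.
VALID_ENGINES = {"mr", "tez", "spark"}


def engine_switch_statements(engine: str = "spark") -> List[str]:
    """Return the HiveQL SET statement that selects the execution engine.

    Hypothetical helper for illustration only; it builds the statement
    text and does not connect to a cluster.
    """
    if engine not in VALID_ENGINES:
        raise ValueError("unsupported engine: {!r}".format(engine))
    return ["SET hive.execution.engine={};".format(engine)]


if __name__ == "__main__":
    # Print the statement that would move a session from MapReduce to Spark.
    for statement in engine_switch_statements("spark"):
        print(statement)
```

The generated statement can be run at the start of a Hive session or placed at the top of a script, which makes it possible to try individual workloads on Spark before changing the cluster-wide default.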