Case studies

Silicon Valley Bank
Establish a central data repository holding all of the bank's enterprise balance sheet data, for use in risk-related analytics and Fed reporting.

Provide a data management layer that supports security, governance, standards, maintainability, and self-service capability.

Enable data analytics and data science for ad hoc reporting.

Build a data warehouse of all loan-level Mortgage-Backed Securities (MBS) data by connecting with three agencies (Fannie Mae, Freddie Mac, and Ginnie Mae), for use in analyzing MBS investment exposure.


  • A central conformed data layer was built in Hive to provide a single view of all business data.
  • Hierarchies and GL balances were stored as reference data.
  • Cloudera Navigator was used to store data lineage and the data dictionary.
  • Sqoop was used for data ingestion.
  • Data was organized into layers to support data governance and functional separation and to improve data consistency.
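
As a rough illustration of the Sqoop ingestion step, the sketch below assembles a `sqoop import` command line in Python. It is not taken from the project: the JDBC URL, user name, password file, table name, and the `/data/raw/...` target directory are all hypothetical, and landing the data in a "raw" layer is an assumption about how the layering might have been arranged. Only standard Sqoop 1 import arguments are used.

```python
import shlex


def sqoop_import_args(jdbc_url, username, password_file, table, target_dir, mappers=4):
    """Build the argument list for a basic Sqoop 1 table import into HDFS."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,              # JDBC connection string of the source database
        "--username", username,
        "--password-file", password_file,   # file holding the password, kept off the command line
        "--table", table,                   # source table to ingest
        "--target-dir", target_dir,         # HDFS directory that receives the files
        "--num-mappers", str(mappers),      # number of parallel import tasks
    ]


# Hypothetical values, for illustration only.
args = sqoop_import_args(
    jdbc_url="jdbc:oracle:thin:@//db.example.com:1521/FINANCE",
    username="etl_user",
    password_file="/user/etl_user/.sqoop_pw",
    table="GL_BALANCES",
    target_dir="/data/raw/gl_balances",
)

# Print a shell-safe command string; a scheduler task could run the same list.
print(shlex.join(args))
```

Building the command as a list rather than a single string keeps each argument separate, which avoids quoting problems if the list is later passed to a process runner or a scheduler such as Airflow or Oozie.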


  • Spark
  • Sqoop
  • Hive
  • Cloudera Navigator
  • Tableau
  • Airflow
  • Oozie


A centralized data hub with a modern architecture on Cloudera Hadoop that can handle data growth, governance, security, and maintainability. Data management cost was reduced by 60% while providing self-service capability.

Bank of the West
Business Problem: A number of SOX-critical business processes were handled manually by multiple business teams, who pulled data from a multitude of sources, ran data quality checks and data transformations, and generated reconciliation and management review control reports at varying frequencies (weekly, monthly, quarterly, and yearly).
These processes were often tedious and error-prone, and their manual nature involved a lot of back and forth between teams. The estimated effort was several hours from multiple team members on an ongoing basis.
Solution: Worked with key business stakeholders to capture requirements and implemented end-to-end automated workflows for business process data analytics. The workflows handle processing, cleansing, blending, and enrichment of large volumes of complex, heterogeneous data (structured and unstructured), and prepare key insights and reports with automated data controls in place.
Technology Stack: Alteryx, SQL Server, Oracle, MongoDB, SharePoint (On-Prem & Cloud), Microsoft Office, Data Warehouse
Outcome: Automation removed manual processes, improved efficiency, provided automated data check controls for SOX, audit, and regulatory needs, cut processing time significantly (~99%), and increased productivity for business users, enabling them to make data-driven decisions with quicker turnaround times.
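
The reconciliation and data check controls in this case study were built in Alteryx. Purely as an illustration of the kind of control involved, the Python sketch below compares balances from two systems and lists the breaks. The account keys, amounts, tolerance, and the `reconcile` helper are all hypothetical and do not come from the project.

```python
from decimal import Decimal


def reconcile(source, target, tolerance=Decimal("0.01")):
    """Compare per-key balances from two systems and return the breaks.

    Each break is a tuple (key, source_value, target_value, reason), where
    reason is "missing" if the key is absent from one side, or "mismatch"
    if the two values differ by more than the tolerance.
    """
    breaks = []
    for key in sorted(set(source) | set(target)):
        src = source.get(key)
        tgt = target.get(key)
        if src is None or tgt is None:
            breaks.append((key, src, tgt, "missing"))
        elif abs(src - tgt) > tolerance:
            breaks.append((key, src, tgt, "mismatch"))
    return breaks


# Hypothetical balances keyed by account number.
ledger = {
    "1000": Decimal("2500.00"),
    "2000": Decimal("130.10"),
    "3000": Decimal("75.00"),
}
warehouse = {
    "1000": Decimal("2500.00"),
    "2000": Decimal("130.55"),
}

# Each printed row is one item for the review control report.
for row in reconcile(ledger, warehouse):
    print(row)
```

`Decimal` is used instead of floating point so that currency amounts compare exactly. An empty result means the two sources agree within the tolerance, which is the condition an automated control would record as a pass.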