Data warehouse platform
Creating a centralized place for all the integrated data sources for a pharmaceutical company
Our client is a leading technology-enabled life science services company that simplifies how patients get on and stay on drug therapies.
The company works with biopharmaceutical manufacturers to help commercialize and maximize the benefits of branded and specialty medications.They leverage industry-leading, tech-enabled services to accelerate speed to therapy throughout the medication journey. The unparalleled patient and provider network and advanced analytics power our ability to improve medication adherence and access.
After acquisition the company faced challenges with integrating data from various data sources. Due to the diversity of data, data analysis became difficult and time-consuming. As the company operates with personal, sensitive data, they faced the need to introduce a centralized repository of all business information that is needed for internal and external purposes.
As a solution to the problem we proposed to design and implement Data Warehouse aimed to serve as central repository for all integrated data sources. Since customer required to gather data from diverse data sources like flat files and different databases, we had to define the approach for data integration
We have implemented data extract, validation and integration in staging area (database), hosted in customer colocated environment. Sensitive data de-identification was implemented to meet customer data security requirements.
As the next step we have identified data sources load priority and designed the rules for rewriting data into enterprise data warehouse (EDW).
We have introduced audit functionality that ensures capturing matrix of data load status changes (Quantity, load duration etc ). We’ve also introduced a unique identifier for each load which ensures that data is connected to some identifier. This allows to clean data in case of load issues or failures caused by incorrect source data received or reload of data with updated data set.
We’ve chosen AWS Redshift as a main storage. And loaded data via s3 bucket by using delete-insert flow. Not to affect users, all of the synchronization is done during the nightly performance window.
We’ve implemented integrated star schema storage that contains structured data from diverse sources and provided an option of effortless data analysis in it. EDW Star schema allows the use of ad hoc reporting. And kept sensitive data in co-located Stage with corresponding mapping to hashed data in EDW. This allows to analyze data with keeping all metrics available.
Outcomes of Project & Success Metrics:
Structured and up-to-date key data co-location in cloud EDW
Separate cloud DB to load and store low value or inconsistent data for further review and analysis
Ability to review data load progress and track the process
PHI sensitive data de-identified
Why choose Divectors
We’ve put our experience into helping our customers achieve their business goals. Take a look at some of the projects Divectors successfully worked on:
A team of professionals with over 11 years of experience that will serve to the growing need of your business
We apply standard Project Management processes and procedures that are customized according to client needs and business requirements.
We use the best Design Thinking practices to get to the bottom of the problem
Agile team of software developers, experience designers, project managers and business analysts that apply the best Agile / Scrum practices