Data Interoperability with Salesforce Data Cloud

Customer 360 Profile Migration Explained with a Practical Case Study
First-party customer data are the cornerstone of brand success, especially in a highly regulated industry such as life sciences, where precision and consent in communication are paramount. Rich first-party data empower brands to create customer 360 profiles, which, in turn, enable personalized and tailored experiences for their customers.
Now imagine creating customer 360 profiles for hundreds of thousands of your customers, only to potentially lose all stored rich data when a decision is made to move to a different platform that better serves your organizational needs.
This white paper sheds light on a proven methodology to achieve data interoperability between systems for semi-structured data, and emphasizes the preservation of critical customer data while establishing an adaptable ecosystem. To illustrate, let us delve into the process of migrating pre-existing customer 360 profiles to a customer data platform (CDP). This solution was designed and deployed by our team of CDP experts and data analysts, who used a Python-based solution for data interoperability.
Setting the Stage
Let us consider a real-world example of an organization aiming to create a connected MarTech ecosystem to deliver integrated and personalized customer experiences at scale through a CDP – Salesforce Data Cloud, in this instance.
While the new platform seemed like a natural fit with their vision of a connected ecosystem, a key factor in the migration decision was the utilization of existing unified customer profiles in CDP-driven customer journey orchestration. With this migration, the organization was evaluating the feasibility of moving unified customer 360 profiles, collected over a two-year period from online and offline channels, to Salesforce Data Cloud.
These profiles contained rich customer data from online and offline engagement channels such as websites, webinars, conferences, email engagements, and rep interactions. Aside from the website channel, all other sources had known customers (i.e., customers with a master data management [MDM] ID). The website channel had both known web users (users with a mapped MDM ID through a campaign management tool) and unknown web users (users without a mapped MDM ID).
Key Considerations for Salesforce Data Cloud Data Ingestion
The considerations for moving existing customer profiles to Salesforce Data Cloud could be broadly classified into two categories – platform and data.
Platform Considerations
AWS (Amazon Web Services) S3 (Simple Storage Service) was identified as a potential solution for ingesting semi-structured customer data into Salesforce Data Cloud, but the CDP allowed ingestion only in the form of structured data with a unique primary key.
The Salesforce Data Cloud mandates the classification of data during the creation of a data stream. The inbound data have to be classified into one of the following categories:
Engagement (e.g., website interaction),
Profile (e.g., CRM profile),
Other (e.g., product information).
Each data stream has a Data Lake Object (DLO) created automatically by Salesforce Data Cloud. Salesforce Data Cloud uses a customer 360 data model, consisting of Data Model Objects (DMOs) representing groups of data such as website engagement and contact email. Once the data are available at the DLO layer, they need to be mapped manually to DMOs. They can be mapped to standard DMO attributes, to custom attributes in standard objects, or to custom DMOs. It should be noted that no transformation of data is possible inside Salesforce Data Cloud, which makes it essential to transform the data before ingestion.
Sample data stream illustrating the permissible data format in Salesforce Data Cloud
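Because the platform accepted only structured rows with a unique primary key, every row produced before ingestion needs a key that stays stable across re-runs. The following is a minimal sketch of one way to derive such a key. The field names (`mdm_id`, `attribute`, `record_id`) and the hashing scheme are illustrative assumptions, not the scheme used in the project.

```python
import hashlib


def make_primary_key(*parts: str) -> str:
    """Build a deterministic key from the parts that identify a row."""
    # The unit-separator character keeps ("ab", "c") distinct from ("a", "bc").
    joined = "\x1f".join(parts)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()[:32]


def assert_unique_keys(rows, key_field="record_id"):
    """Raise ValueError if two rows share the same primary key."""
    seen = set()
    for row in rows:
        key = row[key_field]
        if key in seen:
            raise ValueError(f"duplicate primary key: {key}")
        seen.add(key)


# Hypothetical rows; the real attribute names are not given in this paper.
rows = [
    {"mdm_id": "MDM-001", "attribute": "preferred_channel", "value": "email"},
    {"mdm_id": "MDM-001", "attribute": "specialty", "value": "oncology"},
]
for row in rows:
    row["record_id"] = make_primary_key(row["mdm_id"], row["attribute"])
assert_unique_keys(rows)
```

Deriving the key from the identifying fields, rather than from a running counter, means that re-processing the same profile yields the same key, so a repeated upload updates a row instead of duplicating it.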
Data Considerations
The source data were in the JSON (JavaScript Object Notation) format, and the structure of the JSON file for each customer varied. The profiles contained only attributes relevant to each customer, leading to non-standardization in the overall customer 360 profile structures.
The solution had to be designed to be resilient to any potential modifications of customer profile data.
Certain objects within the data exhibited complex nested structures, requiring the implementation of a multistep processing system for a seamless data migration.
A considerable portion of the data pertaining to unidentified web users lacked comprehensive identification. With only one web-based, platform-dependent identifier in use, the redundancy following the migration process posed a significant challenge.
The data had to be extracted through an S3 bucket and were available in the form of complex, multi-level, nested JSON files.
A snapshot of a customer 360 JSON file
A typical customer 360 profile consists of objects based on the attributes’ data types, such as numbers, strings, arrays, Booleans, and dates.
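To make the type-based grouping concrete, the sketch below classifies the values of a profile into coarse data types. The profile shown is hypothetical, and treating ISO-formatted strings as dates is an assumption made for illustration.

```python
from datetime import datetime


def classify_value(value):
    """Return a coarse type label for a JSON value."""
    # bool must be checked before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, str):
        try:
            datetime.fromisoformat(value)  # assumption: dates arrive as ISO strings
            return "date"
        except ValueError:
            return "string"
    return "null"


# Hypothetical profile; real attribute names are not given in this paper.
profile = {
    "mdm_id": "MDM-001",
    "email_opens": 12,
    "opted_in": True,
    "last_webinar": "2023-05-01",
    "topics": ["oncology", "cardiology"],
}

by_type = {}
for key, value in profile.items():
    by_type.setdefault(classify_value(value), []).append(key)
```

Grouping attributes this way helps decide early which target column type each attribute needs, since dates and Booleans arrive in JSON as strings and literals that a CSV would otherwise flatten into plain text.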
The Solution
After a thorough analysis of the platform’s capabilities and restrictions, an understanding of the customer profile data, and a mapping of the business requirements to the data, a two-stage approach for efficiently transferring the data was developed.
Stage 1 – Overcoming Platform Challenges
1. Analyzing Intricacies: To handle the complex structures of multi-level, nested JSON files, a comprehensive analysis of all possible unique attributes across all profiles was conducted. Recognizing the necessity for a flexible and efficient approach, metadata tables were meticulously designed to simplify the intricate data structures.
2. Streamlining Processes: Complex key-value pairs within JSON files were precisely converted into a more manageable row-column format. Leveraging a powerful Python-based solution, data integrity was ensured while preserving the delicate relationships between various nested attributes.
3. Innovative Transformation: A series of data transformations was orchestrated to unravel the complexities of nested sub-objects. Data migration processes were also streamlined by translating them into organized metadata tables, ensuring a seamless transition to the Salesforce platform.
4. Ensuring Long-Term Scalability: A metadata-driven framework was engineered to enable long-term scalability and adaptability. This framework enabled dynamic adjustments to the data transformation process, accommodating future changes or enhancements to the data structure. This metadata-driven approach future-proofed the solution and empowered the organization to navigate potential data intricacies with confidence.
5. Seamless Integration: Once the CSV file output of our Python-based solution became a DLO within the Salesforce platform, it was mapped to the relevant DMOs. By curating custom objects and attributes, complete coverage of customer profile attributes was ensured, enabling a seamless integration process.
Process flow for migrating the data to Salesforce through a custom Python script
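The Stage 1 steps can be sketched roughly as follows: nested JSON is flattened into path-value pairs, a metadata table selects and renames the paths of interest, and the result is written as CSV. This is a simplified illustration, not the project's script. The `METADATA` mapping, the dotted-path convention, and all attribute names are assumptions.

```python
import csv
import io


def flatten(obj, prefix=""):
    """Flatten nested dicts and lists into {dotted.path: scalar} pairs."""
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            flat.update(flatten(value, f"{prefix}{key}."))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            flat.update(flatten(value, f"{prefix}{index}."))
    else:
        flat[prefix[:-1]] = obj  # drop the trailing dot
    return flat


# Hypothetical metadata table: source path -> target column name.
# Supporting a new attribute means adding a row here, not changing code.
METADATA = {
    "mdm_id": "mdm_id",
    "strings.specialty": "specialty",
    "numbers.email_opens": "email_opens",
    "arrays.topics.0": "topic_1",
}


def to_rows(profiles, metadata):
    """Yield one fixed-width row per profile, driven by the metadata table."""
    for profile in profiles:
        flat = flatten(profile)
        # Attributes missing from a profile become empty cells.
        yield {target: flat.get(source, "") for source, target in metadata.items()}


def to_csv(profiles, metadata):
    """Render the profiles as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(metadata.values()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(to_rows(profiles, metadata))
    return buffer.getvalue()
```

Keeping the mapping in data rather than in code is what makes the approach metadata-driven: profiles with differing structures still produce rows with identical columns, which is the shape a data stream expects.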
Stage 2 – Overcoming Data Challenges
Data migration posed a significant challenge due to inconsistent profile patterns with nested structures within profiles. The data transformation process needed to be resilient and future-proof. Additionally, profiles of known users had to be treated separately from those of unknown users.
1. Comprehensive Data Analysis: The approach commenced with a comprehensive analysis of the customer profile data, aiming to identify potential attributes and assess the feasibility of attribute categorization.
2. Understanding Data Scale: To provide insight into the data volume at hand, the string object of the customer 360 profile alone revealed the presence of more than 200 unique keys across 200,000 profiles. Although not all profiles contained as many keys, the solution was designed to capture any of these potential keys whenever they were encountered.
3. Detailed Source-to-Target Mapping: A thorough examination of all objects within the profiles led to the compilation of a list of potential attributes for each object. A distinction was also made between attributes present in all profiles and those that might be missing in specific customer profiles.
Upon identifying and classifying the attributes, transformation logic was established based on business requirements and platform constraints. Attribute names were standardized through modifications such as name adjustments, removal of brackets, optimization of attribute name length, and elimination of blank spaces.
Data mapping between a DLO and a DMO for one of the source profile objects
4. Implementing the Solution: After an extensive review of the source-to-target mapping file, the logic was translated into a Python-based solution capable of processing the data collected through an S3 bucket.
A sample DMO graph illustrating the linkages between various data categories inside Salesforce Data Cloud
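Two of the steps above lend themselves to a short illustration: taking an inventory of which keys appear in every profile versus only some, and standardizing attribute names. The sketch below shows one possible form of each. The 40-character limit and the exact cleaning rules are assumptions, since the paper does not state the actual constraints.

```python
import re
from collections import Counter


def key_coverage(profiles):
    """Split keys into those present in all profiles and those in only some."""
    counts = Counter()
    for profile in profiles:
        counts.update(profile.keys())
    total = len(profiles)
    always = sorted(k for k, n in counts.items() if n == total)
    optional = sorted(k for k, n in counts.items() if n < total)
    return always, optional


def standardize_name(name, max_length=40):
    """Normalize a source attribute name into a target column name."""
    name = re.sub(r"[\[\](){}]", "", name)       # remove brackets
    name = re.sub(r"\s+", "_", name.strip())     # eliminate blank spaces
    name = re.sub(r"[^0-9A-Za-z_]", "_", name)   # other punctuation -> underscore
    name = re.sub(r"_+", "_", name).strip("_").lower()
    return name[:max_length].rstrip("_")         # assumed length limit
```

For example, `standardize_name("Last Webinar [Date]")` returns `last_webinar_date`. Two different source names can collapse to the same standardized name, so a real mapping file would need a check for such collisions.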
Approach for Unknown Profiles
The data migration strategy for both known and unknown profiles followed a similar approach, with an additional focus on identity attributes that are crucial for handling unknown users. This strategy accounted for the future conversion of unknown users to known users, post-migration to Salesforce Data Cloud.
1. Identification Challenge: A significant challenge with unknown web users was the lack of a consistent unique identifier in customers’ web engagement data that could be seamlessly transferred to Salesforce Data Cloud. As existing users were converted from unknown to known based on a platform ID that was to be discontinued, there was a risk of losing the ability to convert unknown users to known users in the future.
2. Identity Resolution Solution: After thoroughly analyzing the data, a combination of Salesforce ID and Google Analytics Client ID was used for the reidentification of web users.
Google Analytics was already in use as the organization’s website analytics tool, and the Google Analytics Client ID was being captured wherever the web users had consented to it. By integrating the Google Analytics (GA) Client ID, we could effectively track anonymous users and seamlessly link them with Salesforce. This approach eliminated dependency on browser cookies, ensuring consistent tracking even when users switched browsers or cleared their cookies.
A Salesforce ID was then added and passed through a JavaScript-based Salesforce solution called Web SDK (software development kit), which assigned a unique ID to every web user. The Web SDK would also capture the GA Client ID for the user.
This combination of Salesforce ID and GA Client ID enabled identity resolution in Salesforce Data Cloud and helped reidentify most, if not all, unknown web users.
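The matching idea behind this reidentification can be illustrated outside the platform. The sketch below links unknown web profiles to known profiles that share a GA Client ID. It only illustrates the logic; Salesforce Data Cloud performs identity resolution itself, and the field names (`ga_client_id`, `salesforce_id`, `mdm_id`) are hypothetical.

```python
def resolve_identities(known, unknown):
    """Link unknown web profiles to known profiles that share a GA Client ID."""
    # Index known profiles by GA Client ID, which exists only where users consented.
    by_client_id = {}
    for profile in known:
        client_id = profile.get("ga_client_id")
        if client_id:
            by_client_id[client_id] = profile["mdm_id"]

    resolved, still_unknown = [], []
    for profile in unknown:
        mdm_id = by_client_id.get(profile.get("ga_client_id"))
        if mdm_id:
            # Keep the web profile's own fields and attach the known identity.
            resolved.append({**profile, "mdm_id": mdm_id})
        else:
            still_unknown.append(profile)
    return resolved, still_unknown


# Hypothetical records for illustration only.
known = [{"mdm_id": "MDM-001", "ga_client_id": "GA.1"}]
unknown = [
    {"salesforce_id": "SF-9", "ga_client_id": "GA.1"},
    {"salesforce_id": "SF-10", "ga_client_id": "GA.2"},
    {"salesforce_id": "SF-11"},
]
resolved, still_unknown = resolve_identities(known, unknown)
```

Profiles without a GA Client ID, such as those of users who did not consent, stay in the unknown set until another identifier becomes available.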
Testing
The carefully considered deployment of checks and balances at critical junctures within the project’s lifecycle ensured a successful migration. This included two layers of testing – data validation testing and functionality testing.
Data Validation Testing
Migrated data were validated to ensure their accuracy and completeness. Any data discrepancies or anomalies discovered were promptly addressed. Additionally, duplicates were removed, data formats were standardized, and inconsistencies were resolved. Key best practices during this stage of testing included the following:
The review of data mapping and transformation rules
The comparison of migrated data with source data
The validation of data consistency
The cross-referencing of migrated data with original data
Data Explorer in Salesforce Data Cloud can act as a data validation tool through a row-wise view as well as an aggregated view
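The comparison of migrated data with source data can also be scripted. The following is a minimal sketch, assuming both sides have been loaded by some means into lists of dictionaries that share a key column. The name `record_id` and the structure of the report are assumptions.

```python
from collections import Counter


def validate_migration(source_rows, migrated_rows, key_field="record_id"):
    """Compare migrated rows with source rows and report discrepancies."""
    source = {row[key_field]: row for row in source_rows}
    migrated = {row[key_field]: row for row in migrated_rows}
    key_counts = Counter(row[key_field] for row in migrated_rows)
    return {
        # Keys in the source that never arrived in the target.
        "missing": sorted(source.keys() - migrated.keys()),
        # Keys in the target that have no source counterpart.
        "unexpected": sorted(migrated.keys() - source.keys()),
        # Keys that appear more than once in the target.
        "duplicates": sorted(k for k, n in key_counts.items() if n > 1),
        # Keys present on both sides whose values differ.
        "mismatched": sorted(
            k for k in source.keys() & migrated.keys() if source[k] != migrated[k]
        ),
    }
```

An empty list under every heading indicates that the two sides agree row for row. Any non-empty list points to the specific keys to inspect.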
Functionality Testing
The functionality testing process aimed to evaluate the performance and capabilities of Salesforce Data Cloud after migrating profiles from the database. It involved testing various aspects of the CDP’s functionality to ensure that it aligned with the organization’s marketing and customer engagement requirements.
Test Profile Retrieval: Verify that profiles migrated from the source database can be accurately retrieved from Salesforce Data Cloud.
Verify Segmentation Logic: Ensure that segmentation functionalities in Salesforce Data Cloud work effectively.
Test Personalization Features: Confirm that Salesforce Data Cloud can be used for personalized marketing campaigns.
Evaluate Data Enrichment Results: Assess the data enrichment and augmentation features in Salesforce Data Cloud.
Outcomes
The most significant outcome of this large-scale project was the creation of a foundational layer of customer data in Salesforce Data Cloud. This enabled:
A ready-to-use CDP in record time
The ability for the CDP team to share enriched customer 360 profiles in real time with the broader marketing team
A data-leakage-proof solution for known customers
A streamlined process to minimize the loss of unknown profiles
Technical Outcomes
The migration of customer profiles and data to Salesforce, powered by an independent Python-based solution, yielded several impactful technical outcomes, enabling the organization to enhance its marketing and customer engagement strategies. The key technical outcomes were as follows:
Data Consolidation: Migrating customer profiles to Salesforce enabled the consolidation of customer data into a unified platform, streamlining data management, and eliminating data silos.
Data Quality Enhancement: The migration process included data validation and cleansing, resulting in improved data quality and accuracy, which is essential for informed decision-making.
Scalability: Salesforce’s scalability and integration with other Salesforce products allowed the client to onboard more campaign teams to leverage CDP-driven customer 360 profiles.
Business Outcomes
The technology-agnostic migration process led to significant business outcomes, highlighting the efficiency and cost-effectiveness of the solution. These included:
Cost Reduction: By consolidating data management and reducing data inaccuracies, the organization has lowered operational costs and mitigated losses associated with marketing inefficiencies.
Enhanced Agility: The migration has fostered a culture of agility and innovation, allowing the organization to rapidly adapt to changing market conditions and customer demands.
The migration to Salesforce not only optimized technical processes but also enabled the organization to continue its marketing campaigns without any negative impact. The confluence of technical and business outcomes has empowered the organization to thrive in the data-driven world of digital marketing.
Conclusion
Although this white paper demonstrates the possibility of efficient data interoperability for semi-structured data, it is possible to perform this exercise for any platform and for any type of data if the objectives are well-defined, the source details are available, and the expectations for the target platform data are clear.
This white paper demonstrates one of the methods through which we simplified customer data preservation while working toward the broader objective of personalization of customer experiences. Reach out to us if you would like to know more about how we enable a solid data foundation to power Customer 360 and CX personalization for our customers.

Authors

Tanya Bisht
Ramu Aitha