Jodi Dey
Nagesh Sarma, MS
Managing Director of Validation Services, Valiance Partners

Why today’s practices are no longer best practices.

Information assets are the lifeblood of day-to-day business. They are also a huge source of institutional value and intellectual property that must be preserved, extended, and repurposed as the systems underpinning them change. In today’s business context, the task of migrating data and content out of obsolete, often disparate, sources into new platforms has strategic, as well as tactical, implications. The task of data and content migration in life sciences businesses is often further complicated by the need for compliance with regulations specified by the US Food and Drug Administration (FDA), and further guided by industry and standards organizations.

A high-level visual representation of a methodology for implementing and validating a migration of data or content into a “GxP” system or application. (Source: Christian Pease, Valiance Partners)

For example, FDA regulation 21 CFR § 11.10(a) (Part 11) requires that certain computerized systems be validated as fit for their intended use before they can be put into service. Generally, these regulations govern information technology (IT) systems associated with “good practices,” such as good clinical practices (GCP), good laboratory practices (GLP), and good manufacturing practices (GMP). Collectively, such systems are called “GxP” systems.

Computer Systems Validation (CSV) is a process whereby certain procedures are followed and documents produced to establish and present evidence of the quality and effectiveness of GxP systems. In addition to FDA regulatory requirements, industry also provides direction through guidance documents such as GAMP (good automated manufacturing practice), published by ISPE (the International Society for Pharmaceutical Engineering, a professional organization), and through technical standards such as ANSI/ASQ (American National Standards Institute/American Society for Quality). When existing data or unstructured content such as documents are moved out of one or more IT sources and into a GxP system, the methods used to perform the migration, as well as the migration results, must be validated.

This article explores the growing importance of data and content migration and explains why long-standing migration and migration validation practices are no longer adequate to meet business and regulatory requirements. It also offers an alternative approach to implement, test, and validate migrations.

Growing importance of data and content migrations
In today’s environment, characterized by streamlining and standardization, as well as increased consolidation from mergers and acquisitions, there are many good reasons to optimize IT infrastructures, but in doing so, businesses must be careful to maintain the consistency of information needed to support ongoing operations. This information includes content and data that may span generations of technology.

Given the current business environment, migration has taken on strategic importance for a number of reasons. First, the benefits of rationalizing processes and systems hinge on outcomes where services and functionality are as good as or better than before, while free of costly redundancies or unnecessarily diverse technologies. Poorly executed migrations diminish these returns and fail to meet the “as good as or better than before” criterion.

This figure illustrates that as the acceptable error rate approaches 0%, the sample size required for testing approaches 100% of the data or content. (Source: Christian Pease, Valiance Partners)
Second, reengineering and consolidation programs are complex, and their components should be implemented in order of priority, and only when needed. Unless migration methods and tools can accommodate this, projects will tend to be overly costly and time-consuming.

Third, the growing severity of consequences associated with noncompliance extends beyond FDA statutes to include other local or regional governmental statutes, such as those involving privacy and governance. Therefore, the precise use of legacy data in consolidated applications demands that migration efforts adhere to a broader view of regulatory compliance.

Finally, trends in data quality management and growing business requirements for increasingly higher levels of accuracy now compel reconsideration of de facto approaches to migration testing and validation, such as those that rely on sampling techniques.

As a consequence, migration, migration validation, and compliance must be addressed as a whole. Doing so will result in an approach that is repeatable, precise, and determinative. But first, it is important to examine current practices in order to explain new alternatives.

Validated migrations: crude by any measure
The art and science of assuring the quality of mission-critical data migrations has become obsolete with respect to contemporary business challenges, compliance benchmarks, and available technologies, to the extent that current practices can no longer be considered best practices. This is clearly illustrated by the de facto standard for testing data migrations, which is based on sampling techniques originating in high-volume manufacturing and defined by the ANSI/ASQ Z1.4-2003 specification (Sampling Procedures and Tables for Inspection by Attributes).

Information systems are a poor fit for this testing method. Purpose-built applications maintain data over long periods of time, are subject to periodic upgrades, and often contain imports from other, decommissioned systems first developed to meet significantly different requirements. Therefore, the repeatable processes and homogeneous qualities upon which sampling theory is based, and under which error is assumed to be uniformly distributed across a data set or volume of unstructured content, are absent. Compounding these problems is the larger issue of migration and migration methodology, best exemplified by the stark differences between the implementation of enterprise software and the migration projects used to populate it with preexisting information.

Large-scale solutions development is multifaceted, labor-intensive, and involves a variety of specialized skills. To manage this complexity, many consultants, even in-house development teams, boldly promote their own proprietary methodologies, but then generally follow the same five widely accepted steps: specify requirements, design, build, test, and deploy.

However, there is no commonly accepted and repeatable counterpart for data migration. Instead, migrations are simply afterthoughts. They are viewed apart from the mainline development efforts associated with the systems they are used to populate, and more often than not, the migration task is not even considered until the target application is nearing completion.

Sampling: time for a closer look
With the prospect of actually performing a migration, development teams typically consider one of a handful of generally accepted methods for testing and verifying results. In addition to sampling, these are manual inspection, which examines the results of a migration, and process checking, which, unlike sampling and inspections, focuses on verifying that the tool or script used to move the data works as intended.

This figure depicts how data and content sources can contain information assets (data and/or content) that were previously migrated into them as the result of a merger or acquisition, or from a system of another company that has been acquired, as well as sources such as old applications built and used from the start by the company doing the migration. (Source: Christian Pease, Valiance Partners)

Manual inspections typically fall short of 100% verification. This is largely because the real-world ability to perform them rapidly deteriorates in the face of large bodies of data and content, tight schedules, and a lack of available personnel.
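Automated comparison removes this constraint: software can check every migrated record against its source, field by field, rather than inspecting a sample by hand. The sketch below is a minimal illustration of that idea, not any particular vendor's tool; the record layout, the `record_id` key field, and the function name are hypothetical.

```python
# Minimal sketch of automated 100% migration verification: every source
# record is compared against the target, field by field. The record
# layout and the "record_id" key field are hypothetical examples.

def verify_migration(source_records, target_records, key_field="record_id"):
    """Return a list of (key, problem) discrepancies; empty means clean."""
    target_by_key = {rec[key_field]: rec for rec in target_records}
    discrepancies = []
    for src in source_records:
        key = src[key_field]
        tgt = target_by_key.get(key)
        if tgt is None:
            discrepancies.append((key, "missing in target"))
            continue
        for field, value in src.items():
            if tgt.get(field) != value:
                discrepancies.append((key, f"field '{field}' mismatch"))
    # Records in the target with no source counterpart are also errors.
    source_keys = {rec[key_field] for rec in source_records}
    discrepancies.extend(
        (key, "unexpected record in target")
        for key in target_by_key
        if key not in source_keys
    )
    return discrepancies
```

Because every record is checked, the outcome is deterministic: an empty discrepancy list is positive evidence of a complete, accurate migration rather than a statistical inference drawn from a sample.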

Consider these aspects of sampling per the ANSI/ASQ standard. It is wholly based on the notion of testing “lots” or “batches,” where similar products are manufactured using a repeatable process. Consequently, sampling depends on the specific characteristics of a given product; but data is not a product—it is the byproduct of a computerized system. And, in a migration, the entire data set and/or volume of content is the lot.
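The relationship between acceptable error rate and required sample size can be made concrete. Under a zero-acceptance plan (any error found in the sample rejects the lot), a lot with true error rate p passes a sample of size n with probability (1 - p)^n; solving for n shows the sample growing toward the entire population as the acceptable error rate approaches zero. The sketch below uses that simplified plan as an assumption, not the full ANSI/ASQ Z1.4 tables:

```python
import math

def required_sample_size(max_error_rate, confidence=0.95, population=None):
    """Smallest n such that a lot whose true error rate exceeds
    max_error_rate is rejected with the given confidence, under a
    zero-acceptance plan (any sampled error rejects the lot)."""
    if max_error_rate <= 0:
        # A 0% acceptable error rate forces inspection of the entire lot.
        return population
    n = math.ceil(math.log(1 - confidence) / math.log(1 - max_error_rate))
    return min(n, population) if population is not None else n

# As the acceptable error rate shrinks, the sample approaches the lot.
for rate in (0.05, 0.01, 0.001, 0.0):
    print(rate, required_sample_size(rate, population=100_000))
```

At a 5% acceptable error rate the required sample stays small; at 0%, nothing short of inspecting the full lot suffices, which is the point the figure above makes.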

The term data migration engenders little in the way of imaginative thinking or the notion of strategic vision. This is unfortunate and perhaps deceptive, especially in a larger, emerging context. Revenue streams are acquired and divested, as well as created. They must be closely managed to preserve the margins necessary to fund continued research and discovery.

The re-engineering, standardization, and simplification of processes and infrastructure, including compliant information systems and GxP applications, is an important factor in doing so. Now, a fresh and evolving perspective, incorporating new deterministic methods and automated tools, together with an insistence that entire bodies of data and content can and should be addressed, is causing industry to take another look at the scope and purpose of the validated migration task.

This article was published in Drug Discovery & Development magazine: Vol. 10, No. 2, February, 2007, pp. 28-31.