Using Software to Enhance Healthcare

Posted In: Information Tech

By Microsoft

Tuesday, October 12, 2010



Researchers at Johnson & Johnson Pharmaceutical Research and Development (J&J PRD) faced a challenge. Over the years, they had built a state-of-the-art platform to enable discovery of small-molecule drugs, but the expanding role of biologics in pharmaceutical research required a new set of tools to handle large-molecule compounds.

Developing such functionality from scratch was a daunting proposition. It would take time and resources while delaying development of novel treatments for debilitating diseases and disorders.

Researchers at Microsoft Research had a solution. Their new, open-source library of bioinformatics functions, the Microsoft Biology Foundation (MBF), part of the Microsoft Biology Initiative, was designed to address just such a challenge. When the J&J PRD researchers learned about this, they immediately became intrigued.

This confluence of need and opportunity occurred in late November 2009. Now, less than a year later, the benefit has become manifestly apparent. Instead of spending costly time building a foundation for the new biological infrastructure, J&J PRD was able to focus on delivering value-added functionality needed to facilitate development of innovative treatments that have the potential of improving the health and quality of life of patients around the world.

"By using MBF, we were able to provide our users with a greater level of functionality in less time for our initial development phase in the large-molecule space," says Jeremy Kolpak, J&J PRD senior analyst, who will be discussing his team's MBF deployment during the 2010 eScience Workshop, being held in Berkeley, Calif., from Oct. 11-13. "It allowed us to focus on value-added functionality for our scientists and has helped us adapt to new requests quite easily."

Such testimony brings a smile to the face of Simon Mercer, director of Health and Wellbeing for External Research, a division of Microsoft Research.

Simon Mercer

"The principal advantage of MBF," Mercer says, "is that, because it's free and open-source, as a programmer, you get a certain amount of prewritten functionality that you can just build on top of. It gives you more time to do the real science, because we've already supplied the basics."

It didn't take long for J&J PRD to grasp the implications of MBF.

"We were in the process of developing our own infrastructure to work with sequences," Kolpak explains. "This was part of a larger move in our organization to improve how R&D with large molecules was performed and integrate that process with an existing and mature framework for working with small molecules.

"We have been using MBF from the day we heard of it."

That is precisely the focus of the Health and Wellbeing effort within External Research: to collaborate openly with the bioinformatics community by applying advanced computing technologies to provide unprecedented insight into disease and human healthcare.

MBF, built on the Microsoft .NET Framework and aimed at making it easier to implement biological applications on the Windows platform, was launched in Boston on July 9 during the 11th annual Bioinformatics Open Source Conference. Since then, thousands of bioinformaticians have downloaded the tool kit.

Microsoft Research Biology Extension for Excel
The Microsoft Research Biology Extension for Excel, displaying the contents of a FASTA file containing an Influenza A virus sequence.

"There are a lot of biologists who start as post-docs but don't end up going into biological research themselves," Mercer says. "They end up managing the data and writing the scientific applications that the biologists need to do research. They can be anywhere on the continuum between full biologists with no computing background to full computer scientists with little or no biological background.

"They work alongside the biological scientists, but they won't necessarily be those scientists. They'll write scripts and write programs to help the lab run, and they'll also probably do some data analysis."

Companies and academics that pursue such work, naturally, are more concerned with the value they can derive from using software tools than with building the tools themselves.

"I've heard it over and over again from executives of different pharmaceutical companies," Mercer says. "Possibly 90 percent of their software stack has been developed in house but offers them no competitive advantage. The real crown jewels in bioinformatics are relatively small compared with the huge bulk of software they have to maintain.

"They're often in a situation where they want to exchange data with other pharmaceutical companies on a pre-compete level, and they find that hard, because their processing pipelines are uniquely their own. A lot of commercial companies are looking for things like MBF to adopt as a common platform, so they are using the same tools, analyzing the data in the same way, and they are able to share data sets and cut costs."

In other words, MBF helps make bioinformaticians' work a bit simpler. That certainly appears to be the case at J&J PRD.

"We have integrated it into our data-analysis and -visualization platform, Third Dimension Explorer, which has been developed in house," Kolpak says. "This platform is used in a multitude of different contexts."

With regard to J&J PRD's large-molecule exploration, he lists the ability to achieve five distinct tasks:

  • View sequences with their associated assay data to see how variations across compounds impact targets.
  • Align multiple sequences.
  • View aligned sequences and their associated metadata, such as complementarity-determining regions.
  • Extract and translate regions of sequences.
  • Work with sequences of different formats to provide a generic platform for scientists to import and analyze them in one place.
Third Dimension Explorer with Johnson & Johnson Pharmaceutical Research and Development's sequence viewer extension
The Third Dimension Explorer sequence viewer extension enables users to view data in different forms and correlate it directly to the sequence where the data originated. The table contains the sequence data, and the top view shows the aligned sequences, color-coded by hydrophobicity. The views on the right are examples of visualizations of the assay data in the table.
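
Two of the tasks Kolpak lists, reading sequences from a common file format and extracting and translating a region, can be illustrated with a short sketch. This is plain Python written for illustration only; it is not MBF's API (MBF is a .NET library) and not J&J PRD's code. The function names `parse_fasta`, `extract_region`, and `translate`, and the sample record, are hypothetical. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G so that the 64 codons
# line up with the conventional 64-letter amino-acid string ('*' marks a stop).
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def parse_fasta(text: str) -> dict[str, str]:
    """Map each record's identifier (first word after '>') to its sequence."""
    records: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            current = header[0] if header else ""
            records[current] = []
        elif current is not None:
            # Sequence lines may be wrapped; collect them and join at the end.
            records[current].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def extract_region(sequence: str, start: int, end: int) -> str:
    """Return the zero-based, half-open slice [start, end) of a sequence."""
    return sequence[start:end]


def translate(dna: str) -> str:
    """Translate DNA codon by codon; unrecognized codons become 'X'."""
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


# Hypothetical two-line record, used only to exercise the functions.
records = parse_fasta(">demo hypothetical record\nATGGCC\nTAAGGG\n")
region = extract_region(records["demo"], 0, 9)
print(translate(region))  # MA*
```

A library such as MBF packages this kind of routine parsing and translation so that application developers do not have to rewrite and validate it themselves.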

"The goal," Kolpak says, "is to capture operations that are performed routinely and make it extremely efficient to execute in one place. But at the same time, we are not trying to replace existing sequence-analysis tools for the more complex and less used operations."

At Johnson & Johnson Pharmaceutical R&D, there are hundreds of users of the Third Dimension Explorer tool. The MBF-related development is still being completed and rolled out, but 40 people already are using the enhanced data-analysis platform and deriving significant benefits.

"It's hard to quantify the amount of time it has saved us," Kolpak says, "due to the fact we work with an agile development methodology and, for each iteration, we are finding new functionality in MBF that we can utilize. I would say that, for our initial rollout, which required a large amount of framework implementation, it saved us around three months during a six-month initial development cycle."

Biological work might not be the first thing that comes to mind when people think about Microsoft, but it supports such scientists nevertheless.

Basic Local Alignment Search Tool query results displayed in the Microsoft Research Sequence Assembler
The Microsoft Research Sequence Assembler presents the results of a Basic Local Alignment Search Tool query for the current sequence in Silver Map, a visual control developed by the Queensland University of Technology.

"Inside Microsoft Research, we've done lots of biology," Mercer says. "It's not what everybody would expect, but a lot of researchers apply their computer-science research in the biological domain for healthcare. How can you apply Microsoft technologies to scientific research? We often do that through collaborations with academics, where the academic brings the biology, in this case, and Microsoft brings the computer science. Together, hopefully, we advance further than either side would have done independently.

"Eventually, you have to ask yourself the question, 'Why don't we just build a platform so that all of the common elements are written once and don't need to be written again for every single project?' And once that platform exists, and it's open-source and free, why not give it away to the community so it can benefit?"

There are specific ways in which MBF can assist in the biological domain, such as with modularity, extensibility, and code maintenance.

"Those sorts of things that professional programmers think of aren't necessarily the first things in the minds of those who are writing scripts to support a lab," Mercer continues. "MBF sits in the middle, with prewritten functionality in nice, digestible chunks, very standardized."

There are quite a few other biological libraries akin to MBF already in use, some of them for a decade or more. But over time, they have grown unwieldy, making it hard to extend them. And they tend to be written in script-based languages that have no type checking. MBF, on the other hand, offers type checking and guarantees, and it's built atop the common-language runtime, providing the flexibility to handle any of the more than 70 languages that work with .NET, thereby making it easy for a heterogeneous community to use without having to conform to a single language.
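
To give a rough sense of what typed sequence objects buy a developer, here is a minimal sketch in Python. It is an illustration under stated assumptions rather than MBF code: the `DnaSequence` class and `gc_content` function are hypothetical names, and Python itself only enforces the alphabet check at run time, while the type hints are verified by an optional external checker such as mypy. In a statically typed .NET language, the compiler would reject a call that passes the wrong kind of object before the program ever runs.

```python
from dataclasses import dataclass

DNA_ALPHABET = frozenset("ACGT")


@dataclass(frozen=True)
class DnaSequence:
    """An immutable DNA sequence that is validated when it is created."""

    identifier: str
    residues: str

    def __post_init__(self) -> None:
        # Reject any symbol outside the DNA alphabet at construction time,
        # so downstream code can rely on the contents being valid.
        invalid = set(self.residues) - DNA_ALPHABET
        if invalid:
            raise ValueError(
                f"{self.identifier}: invalid DNA symbols {sorted(invalid)}"
            )


def gc_content(sequence: DnaSequence) -> float:
    """Fraction of residues that are G or C (0.0 for an empty sequence)."""
    if not sequence.residues:
        return 0.0
    gc_count = sum(1 for base in sequence.residues if base in "GC")
    return gc_count / len(sequence.residues)


print(gc_content(DnaSequence("demo", "ATGC")))  # 0.5
```

The design choice being illustrated is that validation happens once, where the object is built, instead of being repeated (or forgotten) in every script that touches a raw string.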

"We've also wrapped the individual bits of MBF as workflow activities for our Trident workflow workbench," Mercer adds, "which is also free and downloadable. You don't even have to be a programmer to use MBF. You can just drag and drop and connect the building blocks together to build workflow pipelines."

External Research attempts to understand the precise scientific challenge encountered by its MBF partners, a methodology termed "scenario-based development" that identifies areas where MBF can be made more useful. That methodology will be a key component of the next wave of the tool's enhancement.

DNA sequences displayed in the Microsoft Research Sequence Assembler
The Microsoft Research Sequence Assembler displays a series of short DNA sequences assembled by the Parallel De Novo Assembler algorithm into a contig.

"We're approaching our partners in the academic community and the commercial world to define those scenarios," Mercer says, "and that's what's driving the direction in MBF v2. We encourage the wider community (people who download the source code, understand it, and start developing their own extensions to support their own science) to participate, because the more of those we get, the more broadly we can develop MBF. It will grow by the actions of the community, to support the science that the community wants to support."

That, in the example of J&J PRD, is exactly what is happening.

"A lot of what is on our wish list we have been developing in stride," Kolpak says, "mainly a visualization tool for viewing sequences, in addition to some other sequence file-format supports that contain more than just sequence data. These are all things we plan to contribute back to the MBF development."

And the community at which MBF is focused expects to use open-source code.

"If we want to run a project that would be recognizable and familiar in form to the academic community," Mercer says, "then that would be a software-development project that is open-source, because open-source is a very common model there. We want to get contributions from as broad a set of people as possible.

"We want scientists to get a value out of using Windows," he concludes. "We want scientists to pick up different tools that we have and understand that they can help them do their research more effectively and reach insights more quickly than they would otherwise manage to do. We've got a lot of value to offer in that area."

The folks at Johnson & Johnson Pharmaceutical Research and Development couldn't agree more.

"I am a software developer by trade," Kolpak says, "and by using MBF, I have the confidence that what I am providing our users is not just solid code, but also that the science behind it is accurate."
