https://blog.tensorflow.org/2019/04/the-power-of-building-on-accelerating-platform-deep-variant.html?hl=zh_CN

Community
**·**
Genomics

https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhwvgsFzKuEwY4Icf7PRxrtzF3XQVdI8HPrRWffKe7BvbM7DIaYe0qZOrssAc52jTePss8_sWGRY9zX8NPX3V_do8_oOBPAE0adzlbEHmrB8jqbF-d-wC80NAHMG3otnwU9Lg8FFgtnIXM/s1600/fig1.png

四月 30, 2019 —
*Posted by Andrew Carroll, Lizzie Dorfman*

Editorial Note: This post is published with identical content on the Google DeepVariant blog.

The Genomics team at Google Brain develops DeepVariant, an open-source tool for analyzing the DNA sequence of individuals. DeepVariant is built on top of TensorFlow.

The prior release (v0.7) of DeepVariant featured a three-fold improvement in end-to-end speed and …

The Power of Building on an Accelerating Platform: How DeepVariant Uses Intel’s AVX Optimizations

Editorial Note: This post is published with identical content on the Google DeepVariant blog.

The Genomics team at Google Brain develops DeepVariant, an open-source tool for analyzing the DNA sequence of individuals. DeepVariant is built on top of TensorFlow.

The prior release (v0.7) of DeepVariant featured a three-fold improvement in end-to-end speed and a corresponding decrease in cost relative to the previous version (v0.6). Much of this speed improvement comes by enabling DeepVariant to take advantage of new Intel® Advanced Vector eXtensions (AVX-512) instruction set. The use of AVX-512 illustrates how latest advances in systems and platforms help accelerate and impact applied research.

You can think of this like an active transit terminal, with buses constantly leaving to different destinations. The terminal must decide which buses to send in order to maximize the people on each bus without any of them waiting too long to get a ride.

The size of the register and the ability to fill it meaningfully limits the rate of processing. The AVX-512 instruction set is a specially designed way of packing this register to allow a processor to operate on more information at once. In the analogy, it corresponds to having larger buses for common destinations. See the Intel documentation for further details.

The processes for training and applying deep learning models in TensorFlow involve a great deal of the same types of vector/matrix operations. The Intel Math Kernel Library for Deep Neural Networks developed by Intel allows TensorFlow applications to efficiently fill these larger registers, which in turn allows Intel CPUs to compute faster for these applications.

Figure 1. Simplified conceptual schematic of the advantage of AVX-512 in TF-MKL |

In the prior version of DeepVariant (v0.6), the call_variants step required 11 hours and 11 minutes on a 64-core machine with an Intel Skylake CPU. For the newest release of DeepVariant (v0.7), we optimized DeepVariant’s use of TensorFlow to fully take advantage of the new TF-MKL libraries. By tapping these innovations, the runtime for call_variants when executed on the same 64-core machine is reduced to 3 hours and 25 minutes.

Because DeepVariant v0.6 already used some optimizations from the Intel libraries, we decided that to fairly quantify the acceleration from TF-MKL, we should rebuild the exact same version of DeepVariant (v0.7) with a TensorFlow library that included none of the optimizations. On the same 64-core machine, call_variants requires 14 hours and 7 minutes in this configuration.

Figure 2. Runtime Speedup in DeepVariant v0.7 by Using TF-MKL |

This speed improvement, alongside other optimizations we hope to detail later, have a corresponding decrease in cost. In benchmarks on a typical coverage whole genome, DeepVariant costs $2-$3 to run on Google Cloud Platform in DeepVariant v0.7, a three-fold reduction in cost compared to DeepVariant v0.6.

Furthermore, the power of AVX-512 has improved substantially with new generations of Intel CPUs. The new Intel Skylake CPUs executes DeepVariant faster than Intel Broadwell CPUs. Since we expect this trend to continue as parallelization of vector operations becomes more important, we also expect DeepVariant’s speed to naturally improve in the next generations of Intel CPUs, even beyond the speed improvement that would be seen for general applications.

This is the advantage of building on platforms undergoing rapid improvement (whether hardware, software, or the technology in between). This is why the intersection of genomics and informatics is such an exciting place to be. As our capability to sequence genomes grows exponentially, the hardware and software to analyze these genomes must rise to meet it.

Next post

Community
**·**
Genomics
**·**

The Power of Building on an Accelerating Platform: How DeepVariant Uses Intel’s AVX Optimizations

四月 30, 2019
—
*Posted by Andrew Carroll, Lizzie Dorfman*

Editorial Note: This post is published with identical content on the Google DeepVariant blog.

The Genomics team at Google Brain develops DeepVariant, an open-source tool for analyzing the DNA sequence of individuals. DeepVariant is built on top of TensorFlow.

The prior release (v0.7) of DeepVariant featured a three-fold improvement in end-to-end speed and …

Build, deploy, and experiment easily with TensorFlow