Each night, Bloomberg calculates pricing for 1.3 million hard-to-price asset-backed securities such as collateralized mortgage obligations (including cash flows, key rate duration and such). Since 1996, the market news giant has performed these calculations — single-factor stochastic models based on Monte Carlo simulations — on a farm of Linux servers in its data centers in New York and New Jersey. "These models are ideal for doing things in parallel, and we did parallelize them over traditional x86 Linux computers," says CTO Shawn Edwards.
In 2005, Bloomberg released a more precise two-factor model that calibrates itself to the current volatility surface, but only for ad hoc, on-demand pricing, not overnight batch mode. (The previous model ran in both ad hoc and batch mode.) "This model was more expensive to run and we ran it when people asked for it," Edwards says. "These securities are being held in large portfolios, and there was client demand for us to use this better model for our overnight pricing."
In early 2008, Bloomberg considered scaling up its Linux farm to accommodate this customer demand. "It turned out that in order to compute everything within that eight-hour window, we would need to go from 800 cores to 8,000 cores," Edwards says. "That's a lot of servers, about 1,000. We could do it, but it doesn't scale very well. If we wanted to use it for other ideas, we were faced with having to pile on more and more computers. That's when the idea came in for GPU computing."
A programmer on Edwards' staff suggested trying to run the models on graphics processing units (GPUs). (GPUs are the specialized chips on graphics cards that PCs use to display 2D and 3D graphics. They tend to contain hundreds of floating point processors that are good at handling mathematically intensive and parallel processes such as Monte Carlo simulations.) The programmer ran a proof of concept in March 2008 using the cash flow generation part of the algorithm and showed a dramatic increase in performance. That programmer now runs the team of technologists that works on the bond pricing system.
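To see why Monte Carlo simulations suit this kind of hardware, consider a minimal Python sketch. It is not Bloomberg's proprietary model: it assumes a textbook Vasicek single-factor short-rate process with made-up parameters, and every function name in it is hypothetical. The point is only that each simulated path is independent of the others, so the paths can be split into chunks and handed to separate cores or GPU threads, with a single averaging step at the end.

```python
import math
import random


def simulate_discount_factor(rng, r0, a, b, sigma, years, steps):
    """Walk one short-rate path and return its discount factor.

    Assumed dynamics (Vasicek, for illustration only):
        dr = a * (b - r) * dt + sigma * dW
    """
    dt = years / steps
    r = r0
    integral = 0.0
    for _ in range(steps):
        integral += r * dt  # accumulate the rate along the path
        r += a * (b - r) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return math.exp(-integral)


def price_chunk(seed, n_paths, **params):
    """Sum the discount factors for one independent chunk of paths."""
    rng = random.Random(seed)  # each chunk gets its own random stream
    return sum(simulate_discount_factor(rng, **params) for _ in range(n_paths))


def price_zero_coupon(n_chunks, paths_per_chunk, **params):
    """Average over all chunks to estimate a zero-coupon bond price."""
    # No chunk depends on another, so this loop is the part that could
    # be spread across many cores or GPU threads.
    totals = [price_chunk(seed, paths_per_chunk, **params)
              for seed in range(n_chunks)]
    return sum(totals) / (n_chunks * paths_per_chunk)


# Illustrative parameters, not taken from the article.
PARAMS = dict(r0=0.03, a=0.5, b=0.04, sigma=0.01, years=5.0, steps=60)
price = price_zero_coupon(n_chunks=8, paths_per_chunk=500, **PARAMS)
print(f"Estimated price per unit of face value: {price:.4f}")
```

Here the chunks run one after another on a single processor. On a GPU, each path would typically be assigned to its own thread, which is why hardware with hundreds of floating point processors fits the problem.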
Bloomberg went live in 2009 running its two-factor models on a farm of traditional servers paired with nVidia Tesla GPUs. Instead of having to scale up to 1,000 servers, Bloomberg is using 48 server/GPU pairs.
Bloomberg and nVidia engineers worked together to get the pricing software to run on the GPUs. "The underlying math and algorithms are proprietary to Bloomberg," says Andy Keane, general manager, Tesla supercomputing at nVidia. "We provide training, expertise to make the Bloomberg software GPU-compatible. There's a bit of a wall between the two to protect Bloomberg's intellectual property." Rewriting, restructuring and testing the code to run over the GPUs took about a year. "This service is mission-critical to our customers, they rely on it to make decisions, so we had an extensive testing period," Edwards says.
Part of the pricing application, data gathering, doesn't lend itself well to GPU computing, Edwards notes, because it can't be parallelized. The x86 servers also prepare the problems to be parallelized. But about 90% of the work does run on the GPU platform, he says.
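The serial portion matters because it caps the overall gain, a relationship known as Amdahl's law. The short Python sketch below works through the arithmetic. It assumes the 90% figure refers to the share of the original run time that can be parallelized, which the article does not specify, and the function names are illustrative.

```python
def amdahl_speedup(parallel_fraction, parallel_speedup):
    """Overall speedup when only part of a job is accelerated.

    parallel_fraction: share of the original run time that is sped up
    parallel_speedup:  how many times faster that share becomes
    """
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / parallel_speedup)


def speedup_ceiling(parallel_fraction):
    """Limit of the overall speedup as the parallel part becomes instant."""
    return 1.0 / (1.0 - parallel_fraction)


# Assumed reading of the article's figure: 90% of run time is parallel.
for factor in (10, 100, 1000):
    print(f"{factor}x on the parallel part -> "
          f"{amdahl_speedup(0.9, factor):.2f}x overall")
print(f"Ceiling: {speedup_ceiling(0.9):.1f}x")
```

Under that assumption, no amount of GPU hardware could make the whole job more than about 10 times faster, because the remaining serial tenth still runs at its original speed on the x86 servers.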
"Overall, we've achieved an 800% performance increase," Edwards says. "What used to take eight hours we're computing in two hours." The GPUs are high speed, running double-precision mathematics at 16 teraflops. (A teraflop is equivalent to a trillion floating point operations per second.) And the firm is a little greener now: the server/GPU pairs consume one-third of the energy 1,000 servers would have required and occupy less data center space. Cost-wise, the GPU project was equivalent to scaling up the Linux farm, Edwards says.
In the future, Bloomberg plans to run other types of calculations, such as pricing of other types of derivatives and portfolio valuations, on GPUs.
"One of the challenges Bloomberg always faces is that we have very large scale," Edwards says. "We're serving all the financial and business community and there are a lot of different instruments and models people want calculated. This is a nice tool in our toolkit that we're looking to apply in different places."