Posts Tagged ‘EWSA’

Accelerating Password Recovery: the Addition of FPGA

Tuesday, July 17th, 2012

Back in 2008, ElcomSoft started using consumer-grade video cards to accelerate password recovery. The abilities of today’s GPU’s to perform massively parallel computations helped us greatly increase the speed of recovering passwords. Users of GPU-accelerated ElcomSoft password recovery tools were able to see the result 10 to 200 times (depending on system configuration) sooner than the users of competing, non-accelerated products.

Today, ElcomSoft introduced support for a new class of acceleration hardware: Field Programmable Gate Arrays (FPGAs) used by Pico Computing in its hardware acceleration modules. Two products have received the update: Elcomsoft Phone Password Breaker and Elcomsoft Wireless Security Auditor, enabling accelerated recovery of Wi-Fi WPA/WPA2 passwords as well as passwords protecting Apple and Blackberry offline backups. In near future, Pico FPGA support will be added to Elcomsoft Distributed Password Recovery.

With FPGA support, ElcomSoft products now support a wide range of hardware acceleration platforms including Pico FPGA’s, OpenCL compliant AMD video cards, Tableau TACC, and NVIDIA CUDA compatible hardware including conventional and enterprise-grade solutions such as Tesla and Fermi.

Hardware Acceleration of Password Recovery
Today, no serious forensic user will use a product relying solely on computer’s CPU. Clusters of GPU-accelerated workstations are employed to crack a wide range of passwords from those protecting office documents and databases to passwords protecting Wi-Fi communications as well as information stored in Apple and BlackBerry smartphones. But can consumer-grade video cards be called the definite ‘best’ solution?

GPU Acceleration: The Other Side of the Coin
Granted, high-end gaming video cards provide the best bang for the buck when it comes to buying teraflops. There’s simply no competition here. A cluster of 4 AMD or NVIDIA video cards installed in a single chassis can provide a computational equivalent of 500 or even 1000 dual-core CPU’s at a small fraction of the price, size and power consumption of similarly powerful workstation equipped only with CPU’s.

However, GPU’s used in video cards, including enterprise-grade solutions such as NVIDIA Tesla, are not optimized for the very specific purpose of recovering passwords. They still do orders of magnitude better than CPU’s, but if one’s looking for a solution that prioritizes absolute performance over price/performance, there are alternatives.

 How Would You Like Your Eggs?
A single top of the line video card such as AMD Radeon 7970 consumes about 300 W at top load. It generates so much heat you can literally fry an egg on it! A cluster of four gaming video cards installed into a single PC will suck power and generate so much heat that cooling becomes a serious issue.

Accelerating Password Recovery with FPGAs
High-performance password cracking can be achieved with other devices. Field Programmable Gate Arrays (FPGAs) will fit the bill just perfectly. A single 4U chassis with a cluster of FPGA’s installed can offer a computational equivalent of over 2,000 dual-core processors.

The power consumption of FPGA-based units is dramatically less than that of consumer video cards. For example, units such as Pico E-101 draw measly 2.5 W. FPGA-based solutions don’t even approach the level of power consumption and heat generation of gaming video cards, running much cooler and comprising a much more stable system.

GPU vs. FPGA Acceleration: The Battle
Both GPU and FPGA acceleration approaches have their pros and contras. The GPU approach offers the best value, delivering optimal price/performance ratio to savvy consumers and occasional users. Heavy users will have to deal with increased power consumption and heat generation of GPU clusters.

FPGA’s definitely cost more per teraflop of performance. However, they are better optimized for applications such as password recovery (as opposed to 3D and video calculations), delivering significantly better performance – in absolute terms – compared to GPU-accelerated systems. FPGA-based systems generate much less heat than GPU clusters, and consume significantly less power. In addition, an FPGA-based system fits perfectly into a single 4U chassis, allowing forensic users building racks stuffed with FPGA-based systems. This is the very reason why many government, intelligence, military and law enforcement agencies are choosing FPGA-based systems.

ElcomSoft Half-Switches to OpenCL

Thursday, March 8th, 2012

OpenCL_Artwork

ElcomSoft has recently announced the switch to OpenCL, an open cross-platform architecture offering universal, future-proof accessibility to a wide range of acceleration hardware. We’re actively using GPU acceleration for breaking passwords faster. No issues with NVIDIA hardware, but working with AMD devices has always been a trouble.

So we jumped in, embedding OpenCL support into Elcomsoft Phone Password Breaker and Wireless Security Auditor. As an immediate benefit, we were able to add long-awaited support for AMD’s latest generation of graphic accelerators, the AMD Radeon™ HD 7000 Series currently including AMD Radeon™ HD 7750, 7770, 7950, and 7970 models. Headache-free support for future generations of acceleration hardware is icing on the cake.

OpenCL_Benchmark

After switching to OpenCL, we further optimized acceleration code for AMD hardware, squeezing up to 50% more speed out of the same boards. This isn’t something to sniff at, as even a few per cents of performance can save hours when breaking long, complex passwords.

OpenCL vs. CUDA

AMD goes OpenCL. What about NVIDIA? Technically, we could have handled NVIDIA accelerators the same way, via OpenCL (it’s a cross-platform architecture, remember?) In that case, we would be getting a simpler, easier to maintain product line with a single acceleration technology to support.

However, we’re not making a full commitment just yet. While some of us love open-source, publicly maintained cross-platform solutions, these are not always the best thing to do in commercial apps. And for a moment here, we’re not talking about licensing issues. Instead, we’re talking sheer speed. While OpenCL is a great platform, offering future-proof, headache-free support of future acceleration hardware, it’s still an extra abstraction layer sitting between the hardware and our code. It’s great when we’re talking AMD, a company known for a rather inconsistent developer support for its latest hardware; there’s simply no alternative. If we wanted access to their latest state-of-the-art graphic accelerators such as AMD Radeon™ HD 7000 Series boards, it was OpenCL or nothing.

We didn’t have such issues with AMD’s main competitor, NVIDIA. NVIDIA was the first player on this arena, being the first to release graphical accelerators capable of fixed-point calculations. It was also the first to offer non-gaming developers access to sheer computational power of its GPU units by releasing CUDA, an application programming interface enabling developers use its hardware in non-graphical applications. From the very beginning and up to this day, CUDA maintains universal compatibility among the many generations of NVIDIA graphical accelerators. The same simply that can’t be said about AMD.

So is it the “if it ain’t broke, don’t fix it” approach? Partly, but that’s just one side of the coin. CUDA simply offers better performance than OpenCL. The speed benefit is slight, but it is there, and it’s significant enough to get noticed. We want to squeeze every last bit of performance out of our products and computers’ hardware, and that’s the real reason we’ll be staying with CUDA for as long as it’s supported – or until OpenCL offers performance that can match that of CUDA.

Did we make the switch half-heartedly? Nope. We’re enthusiastic about the future of OpenCL, looking forward to run our software on new acceleration platforms. But we don’t want to abandon our heritage code – especially if it performs better than its replacement!

New sweeping WPA Cracker & its alternatives

Tuesday, December 8th, 2009

It’s a well-know fact that WPA-PSK networks are vulnerable to dictionary attacks, though one cannot but admit that running a respectable-sized dictionary over a WPA network handshake can take days or weeks.

A low-cost service for penetration testers that checks the security of wireless networks by running passwords against a 135-million-word dictionary has been recently unveiled. The so-called WPA Cracker is a cloud-based service that accesses a 400-CPU cluster. For $34, it can run a password against all 135 million entries in about 20 minutes. Want to pay less, do it for $17 and wait 40 minutes to see the results.

Another notable feature is the use of the dictionary that has been set up specifically for cracking Wi-Fi Protected Access passwords. While Windows, UNIX and other systems allow short passwords, WPA pass codes must contain a minimum of eight characters. Its entries use a variety of words, common phrases and "elite speak" that have been compiled with WPA networks in mind.

WPA Cracker is used by capturing a wireless network's handshake locally and then uploading it, along with the network name. The service then compares the PBKDF2, or Password-Based Key Derivation Function, against the dictionary. The approach makes sense, considering each handshake is salted using the network's ESSID, a technique that makes rainbow tables only so useful.

Everything seems to be perfect, but for the fact that there exists another alternative to crack WPA passwords which allows to reach the same speed. Just instead of installing a 400-CPU cluster, it’s possible to set 4 top Radeons or about two Teslas and try Elcomsoft Wireless Security Auditor.

Elcomsoft Wireless Security Auditor: WPA-PSK Password Audit

CUDA-enabled applications

Monday, May 18th, 2009

Tom’s Hardware has tested two mainstream NVIDIA cards (GeForce 9600 GT and GeForce 9800 GTX) on several CUDA-enabled applications. The applications were:

  • SETI@home
  • CyberLink PowerDirector
  • Tsunami MPEG Encoder
  • Super LoiLoScope
  • Badaboom

(more…)