Publications on Bloom Filters

Reconfigurable Network Group (In Reverse Chronological Order)
(Also available as BIBTeX format)

Journals and Conferences

Fast and Scalable Pattern Matching for Network Intrusion Detection Systems, by Sarang Dharmapurikar and John W. Lockwood, IEEE Journal on Selected Areas in Communications, (JSAC) Oct. 2006, Volume: 24, Issue: 10, pp. 1781- 1792

Abstract: High-speed packet content inspection and filtering devices rely on a fast multi-pattern matching algorithm which is used to detect predefined keywords or signatures in the packets. Multi-pattern matching is known to require intensive memory accesses and is often a performance bottleneck. Hence specialized hardware-accelerated algorithms are required for line-speed packet processing.
We present hardware-implementable pattern matching algorithm for content filtering applications, which is scalable in terms of speed, the number of patterns and the pattern length. Our algorithm is based on a memory efficient multi-hashing data structure called Bloom filter. We use embedded on-chip memory blocks in FPGA/VLSI chips to construct Bloom filters which can suppress a large fraction of memory accesses and speed up string matching. Based on this concept, we first present a simple algorithm which can scan for several thousand short (up to 16 bytes) patterns at multi-gigabit per second speeds with a moderately small amount of embedded memory and a few mega bytes of external memory. Furthermore, we modify this algorithm to be able to handle arbitrarily large strings at the cost of a little more on-chip memory. We demonstrate the merit of our algorithm through theoretical analysis and simulations performed on Snort's string set.

Fast Packet Classification Using Bloom Filters, by Sarang Dharmapurikar, Haoyu Song, Jonathan Turner, John Lockwood, ACM Symposim on Architectures for Networking and Communications Systems (ANCS), San Jose, CA, Dec 4-5, 2006

Abstract: Ternary Content Addressable Memory (TCAM), although widely used for general packet classification, is an expensive and high power-consuming device. Algorithmic solutions which rely on commodity memory chips are relatively inexpensive and power-efficient but have not been able to match the generality and performance of TCAMs. Therefore, the development of fast and power-efficient algorithmic packet classification techniques continues to be a research subject.
In this paper we propose a new approach to packet classification which combines architectural and algorithmic tech- niques. Our starting point is the well-known crossproduct algorithm which is fast but has significant memory overhead due to the extra rules needed to represent the crossproducts. We show how to modify the crossproduct method in a way that drastically reduces the memory requirement without compromising on performance. Unnecessary accesses to the off-chip memory are avoided by filtering them through on- chip Bloom filters. For packets that match p rules in a rule set, our algorithm requires just 4+p+e. independent memory accesses to return all matching rules, where . e << 1 is a small constant that depends on the false positive rate of the Bloom filters. Using two commodity SRAM chips, a throughput of 38 Million packets per second can be achieved. For rule set sizes ranging from a few hundred to several thousand filters, the average rule set expansion factor attributable to the al- gorithm is just 1.2 to 1.4. The average memory consumption per rule is 32 to 45 bytes.

Multi-pattern Signature Matching for Hardware Network Intrusion Detection Systems, by Haoyu Song and John W. Lockwood, IEEE Globecom 2005, St. Louis, MO, Nov. 28, 2005, pp. CN-02-3.

Abstract: Network Intrusion Detection System (NIDS) performs deep inspections on the packet payload to identify, deter and contain the malicious attacks over the Internet. It needs to perform exact matching on multi-pattern signatures in real time. In this paper we introduce an efficient data structure called Extended Bloom Filter (EBF) and the corresponding algorithm to perform the multi-pattern signature matching. We also present a technique to support long signature matching so that we need only to maintain a limited number of supported signature lengths for the EBFs. We show that at reasonable hardware cost we can achieve very fast and almost time-deterministic exact matching for thousands of signatures. The architecture takes the advantages of embedded multi-port memories in FPGAs and can be used to build a full-featured hardware-based NIDS.

Fast and Scalable Pattern Matching for Content Filtering, by Sarang Dharmapurikar, John Lockwood Proceedings of Symposium on Architectures for Networking and Communications Systems (ANCS), Princeton, NJ, Oct 2005.

Abstract: High-speed packet content inspection and filtering devices rely on a fast multi-pattern matching algorithm which is used to detect predefined keywords or signatures in the packets. Multi-pattern matching is known to require intensive memory accesses and is often a performance bottleneck. Hence specialized hardware-accelerated algorithms are being developed for line-speed packet processing. While several pattern matching algorithms have already been developed for such applications, we find that most of them suffer from scalability issues. To support a large number of patterns, the throughput is compromised or vice versa.
We present a hardware-implementable pattern matching algorithm for content filtering applications, which is scalable in terms of speed, the number of patterns and the pattern length. We modify the classic Aho-Corasick algorithm to consider multiple characters at a time for higher throughput. Furthermore, we suppress a large fraction of memory accesses by using Bloom filters implemented with a small amount of on-chip memory. The resulting algorithm can support matching of several thousands of patterns at more than 10 Gbps with the help of a less than 50 KBytes of embedded memory and a few megabytes of external SRAM. We demonstrate the merit of our algorithm through theoretical analysis and simulations performed on Snort's string set.

Fast Hash Table Lookup Using Extended Bloom Filter: An Aid to Network Processing, by Haoyu Song, Sarang Dharmapurikar, Jonathan Turner, John Lockwood; ACM SIGCOMM; Philadelphia, PA; August 21-26, 2005.

Abstract: Hash table is used as one of the fundamental modules in several network processing algorithms and applications such as route lookup, packet classification, per-flow state management and network monitoring. These applications, which typically form components of data-path in a high-speed router, must process and forward packets with little or no buffer in order to maintain the wire-speed throughout. A poorly designed hash table can critically affect the worstcase throughput due to multiple memory accesses required for each lookup. Hence, high throughput requirement in turn underscores the need for a hash table having good and more predictable worstcase lookup performance. While most of the existing hash table based packet processing algorithms rely on the assumption that hash table lookup needs constant time, very little discussion is provided on the underlying engineering considerations to achieve this performance.
We present a novel hash table data structure and lookup algorithm which improves the performance of a naive hash table by providing better bounds on the hash collisions and memory accesses per search. Our algorithm extends the multiple-hashing Bloom Filter data structure to support exact matches. We contrive our hash table architecture by coupling our algorithm with the latest advances in embedded memory technology. Through theoretical analysis and simulations we show that our algorithm is significantly faster for practical purposes than the naive hash table using the same amount of memory, hence it can support better throughput for router applications based on hash tables.

Implementation Results of Bloom Filters for String Matching, by Michael Attig, Sarang Dharmapurikar, and John Lockwood; In Proceedings of: IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM), Napa, CA, April 20-23, 2004.

Abstract: Network Intrusion Detection and Prevention Systems (IDPS) use string matching to scan Internet packets for malicious content. Bloom filters offer a mechanism to search for a large number of strings efficiently and concurrently when implemented with Field Programmable Gate Array (FPGA) technology. A string matching circuit has been implemented within the FPX platform using Bloom filters. Using 155 block RAMs on a single Xilinx VirtexE 2000 FPGA, the circuit scans for 35,475 unique signatures.

Deep Packet Inspection using Parallel Bloom Filters, by Sarang Dharmapurikar, Praveen Krishnamurthy, Todd S. Sproull, John W. Lockwood; IEEE Micro, Vol. 24, No. 1, Jan 2004, pp. 52-61

Abstract: Because conventional software-based packet inspection algorithms have not kept pace with high-speed networks, interest has turned to using hardware to process network data quickly. String scanning with Bloom filters can scan entire packet payloads for predifined signatures at multi-Gigabit-per-second line speeds.

Longest Prefix Matching Using Bloom Filters; by Sarang Dharmapurikar, Praveen Krishnamurthy, David E. Taylor, Proceedings of SIGCOMM 2003, Karlsruhe, Germany, Aug. 25-29, 2003.

Abstract We introduce the first algorithm that we are aware of to employ Bloom filters for Longest Prefix Matching (LPM). The algorithm performs parallel queries on Bloom filters, an efficient data structure for membership queries, in order to determine address prefix membership in sets of prefixes sorted by prefix length. We show that use of this algorithm for Internet Protocol (IP) routing lookups results in a search engine providing better performance and scalability than TCAM-based approaches. The key feature of our technique is that the performance, as determined by the number of dependent memory accesses per lookup, can be held constant for longer address lengths or additional unique address prefix lengths in the forwarding table given that memory resources scale linearly with the number of prefixes in the forwarding table. Our approach is equally attractive for Internet Protocol Version 6 (IPv6) which uses 128-bit destination addresses, four times longer than IPv4. We present a basic version of our approach along with optimizations leveraging previous advances in LPM algorithms. We also report results of performance simulations of our sys snapshots of IPv4 BGP tables and extend the results to IPv6. Using less than 2Mb of embedded RAM and a commodity SRAM device, our technique achieves average performance of one hash probe per lo and a worst case of two hash probes and one array access per lookup.

Deep Packet Inspection Using Parallel Bloom Filters, by Sarang Dharmapurikar, Praveen Krishnamurthy, Todd Sproull, John W. Lockwood; Hot Interconnects 11 (HotI), Stanford, CA, USA, pp. 44-51, Aug. 2003.

Abstract Recent advances in network packet processing focus on payload inspection for applications that include contentbased billing, layer-7 switching and Internet security. Most of the applications in this family need to search for predefined signatures in the packet payload. Hence an important building block of these processors is string matching infrastructure. Since conventional software-based algorithms for string matching have not kept pace with high network speeds, specialized high-speed, hardware-based solutions are needed. We describe a technique based on Bloom filters for detecting predefined signatures (a string of bytes) in the packet payload. A Bloom filter is a data structure for representing a set of strings in order to support membership queries. We use hardware Bloom filters to isolate all packets that potentially contain predefined signatures. Another independent process eliminates false positives produced by Bloom filters.
We outline our approach for string matching at line speeds and present the performance analysis. Finally, we report the results for a prototype implementation of this system on the FPX platform. Our analysis shows that with the state-of-the-art FPGAs, a set of 10,000 strings can be scanned in the network data at the line speed of OC48 (2.4 Gbps).

Related Web Pages

Reconfigurable Network Group Homepage
FPX Homepage