20.3.1 Naive Analysis of Linear Probing

To estimate the performance of linear probing, we make two assumptions:

1. the hash table is large, and
2. each probe in the hash table is independent of the previous probe.

The simplistic analysis of linear probing is based on the assumption that successive probes are independent. This assumption is not true, and thus the analysis underestimates the costs of searching and insertion.

Assumption 1 is reasonable; otherwise, we would not be bothering with a hash table. Assumption 2 says that, if the fraction of the table that is full is λ, each time we examine a cell the probability that it is occupied is also λ, independent of any previous probes. Independence is an important statistical property that greatly simplifies the analysis of random events. Unfortunately, as discussed in Section 20.3.2, the assumption of independence is not only unjustified, but it also is erroneous. Thus the naive analysis that we perform is incorrect. Even so, it is helpful because it tells us what we can hope to achieve if we are more careful about how collisions are resolved.

As mentioned earlier in the chapter, the performance of the hash table depends on how full the table is. Its fullness is given by the load factor.

DEFINITION: The load factor, λ, of a probing hash table is the fraction of the table that is full. The load factor ranges from 0 (empty) to 1 (completely full).

We can now give a simple but incorrect analysis of linear probing in Theorem 20.1.

Theorem 20.1

If independence of probes is assumed, the average number of cells examined in an insertion using linear probing is 1/(1 - λ), where λ is the load factor.

Proof

For a table with a load factor of λ, the probability of any cell's being empty is 1 - λ. Consequently, the expected number of independent trials required to find an empty cell is 1/(1 - λ).

In the proof of Theorem 20.1 we use the fact that, if the probability of some event's occurring is p, then on average 1/p trials are required until the event occurs, provided that the trials are independent. For example, the expected number of coin flips until a heads occurs is two, and the expected number of rolls of a single six-sided die until a 4 occurs is six, assuming independence.
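The 1/p rule can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `expected_trials` and the cutoff `terms` are illustrative assumptions. It sums k times the probability that the first success comes on trial k, which is the definition of the expected value for independent trials.

```python
def expected_trials(p, terms=10_000):
    """Approximate the expected number of independent trials until the
    first success, where each trial succeeds with probability p.

    The infinite sum is truncated after `terms` terms (an illustrative
    cutoff); the neglected tail is tiny unless p is very small.
    """
    expected = 0.0
    prob_all_failed = 1.0  # probability that the first k-1 trials all failed
    for k in range(1, terms + 1):
        # First success on trial k: k-1 failures followed by one success.
        expected += k * prob_all_failed * p
        prob_all_failed *= 1.0 - p
    return expected


if __name__ == "__main__":
    print(expected_trials(1 / 2))   # coin flips until a heads
    print(expected_trials(1 / 6))   # die rolls until a 4
    print(expected_trials(1 - 0.9)) # probes until an empty cell at load factor 0.9
```

The last call uses p = 1 - 0.9, the probability that a cell is empty when the table is 90 percent full, so the sum should agree with the naive estimate 1/(1 - 0.9).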

Figure 20.5 Illustration of primary clustering in linear probing (b) versus no clustering (a) and the less significant secondary clustering in quadratic probing (c). Long lines represent occupied cells, and the load factor is 0.7.

20.3.2 What Really Happens: Primary Clustering

The effect of primary clustering is the formation of large clusters of occupied cells, making insertions into the cluster expensive (and then the insertion makes the cluster even larger).

Unfortunately, independence does not hold, as shown in Figure 20.5. Part (a) shows the result of filling a hash table to 70 percent capacity, if all successive probes are independent. Part (b) shows the result of linear probing. Note the group of clusters: the phenomenon known as primary clustering. In primary clustering, large blocks of occupied cells are formed. Any key that hashes into this cluster requires excessive attempts to resolve the collision, and then it adds to the size of the cluster. Not only do items that collide because of identical hash functions cause degenerate performance, but also an item that collides with an alternative location for another item causes poor performance. The mathematical analysis required to take this phenomenon into account is complex but has been solved, yielding Theorem 20.2.

Theorem 20.2

The average number of cells examined in an insertion using linear probing is roughly (1 + 1/(1 - λ)^2)/2, where λ is the load factor.

Proof

The proof is beyond the scope of this text. See reference [6].
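Although the proof is out of scope, the effect of clustering can be observed with a small simulation. The sketch below is an illustration under stated assumptions, not the text's own code: random cell indices stand in for hash values, and all function names are hypothetical. It builds one table by linear probing and one by occupying cells uniformly at random (the state that independent probes would produce), then measures the average number of cells a linear-probing insertion would examine, averaged over every possible starting cell.

```python
import random


def fill_linear_probing(size, load, rng):
    """Occupy round(load * size) cells by inserting with linear probing."""
    table = [False] * size
    for _ in range(round(load * size)):
        pos = rng.randrange(size)      # stand-in for hash(key) % size
        while table[pos]:              # collision: try the next cell
            pos = (pos + 1) % size
        table[pos] = True
    return table


def fill_uniform(size, load, rng):
    """Occupy round(load * size) cells chosen uniformly at random,
    modeling the no-clustering case of independent probes."""
    table = [False] * size
    for pos in rng.sample(range(size), round(load * size)):
        table[pos] = True
    return table


def average_insertion_cost(table):
    """Average number of cells examined (including the empty cell found)
    by a linear-probing insertion, over all starting cells.
    Requires at least one empty cell."""
    size = len(table)
    total = 0
    for start in range(size):
        pos, examined = start, 1
        while table[pos]:
            pos = (pos + 1) % size
            examined += 1
        total += examined
    return total / size


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed so runs are repeatable
    size, load = 20_000, 0.7   # 70 percent full, as in the figure
    clustered = average_insertion_cost(fill_linear_probing(size, load, rng))
    uniform = average_insertion_cost(fill_uniform(size, load, rng))
    print(f"linear probing: {clustered:.2f} cells, no clustering: {uniform:.2f} cells")
```

With a table this large, the uniformly filled table should land near the naive estimate of Theorem 20.1 and the linear-probing table near the estimate of Theorem 20.2, although individual runs vary with the seed.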

For a half-full table, we obtain 2.5 as the average number of cells examined during an insertion. This outcome is close to the 2 cells that the naive analysis indicated. The main difference occurs as λ gets close to 1. For instance, if the table is 90 percent full, λ = 0.9. The naive analysis suggests that 10 cells would have to be examined: a lot, but not completely out of the question. However, by Theorem 20.2, the real answer is that some 50 cells need to be examined. That is excessive (especially as this number is only an average and thus some insertions must be worse).
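The two estimates are easy to tabulate side by side. This is a small sketch that evaluates the formulas from Theorems 20.1 and 20.2 directly; the function names are illustrative assumptions.

```python
def naive_insertion_cost(load):
    """Theorem 20.1: cells examined per insertion if probes were independent."""
    return 1.0 / (1.0 - load)


def clustered_insertion_cost(load):
    """Theorem 20.2: approximate cells examined per insertion with linear probing."""
    return (1.0 + 1.0 / (1.0 - load) ** 2) / 2.0


if __name__ == "__main__":
    print("load   naive   linear probing")
    for load in (0.25, 0.5, 0.75, 0.9, 0.95):
        naive = naive_insertion_cost(load)
        clustered = clustered_insertion_cost(load)
        print(f"{load:4.2f}  {naive:6.1f}  {clustered:8.1f}")
```

The gap between the two columns stays small up to about half full and widens quickly after that, because the second formula grows with the square of 1/(1 - load).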

Primary clustering is a problem at high load factors. For half-empty tables, the effect is not disastrous.