🕷️ Crawler Inspector

URL Lookup

Direct Parameter Lookup

Raw Queries and Responses

1. Shard Calculation

Query:
Response:
Calculated Shard: 19 (from laksa009)

2. Crawled Status Check

Query:
Response:

3. Robots.txt Check

Query:
Response:

4. Spam/Ban Check

Query:
Response:

5. Seen Status Check

ℹ️ Skipped - page is already crawled

📄 INDEXABLE | CRAWLED 24 days ago
🤖 ROBOTS ALLOWED

Page Info Filters

Filter | Status | Condition | Details
HTTP status | PASS | download_http_code = 200 | HTTP 200
Age cutoff | PASS | download_stamp > now() - 6 MONTH | 0.8 months ago
History drop | PASS | isNull(history_drop_reason) | No drop reason
Spam/ban | PASS | fh_dont_index != 1 AND ml_spam_score = 0 | ml_spam_score=0
Canonical | PASS | meta_canonical IS NULL OR = '' OR = src_unparsed | Not set

Page Details

Property | Value
URL | https://www.cs.fsu.edu/~burmeste/slideshow/chapter12/12-3.html
Last Crawled | 2026-03-24 05:57:16 (24 days ago)
First Indexed | 2019-07-09 23:50:02 (6 years ago)
HTTP Status Code | 200
Meta Title | Hash Functions
Meta Description | null
Meta Canonical | null
Boilerpipe Text
Good hash functions should aim at the assumption of simple uniform hashing (SUH): each key is equally likely to hash into any of the $m$ slots. Draw keys $k$ from $U$ with probability $P(k)$, so SUH means

$$\sum_{k \,:\, h(k) = j} P(k) = \frac{1}{m}, \qquad j = 0, \dots, m-1,$$

thus all hash values are equally likely. This is hard to use, as the $P(k)$'s are not usually known (even approximately). If the $k$'s are $U[0,1)$ and independent, then $h(k) = \lfloor k m \rfloor$ ($m$ fixed) satisfies SUH. Hash functions are usually chosen to be as "independent" as possible of patterns that may exist in the $k$'s.

Note: SUH is powerful, but often hash functions need to do "better than" just SUH. Sometimes "mixing" and "separation" are required, so that close keys $k_1$, $k_2$ should be far apart in their hash values.

Aside: Why such a "separating" hash function? Cryptographic verification of signatures.

My Will file → hash → digest → sign → certificate

What if my brother could change my will slightly (add the word "not") and not change the digest? Bad news for me!

Even though keys may not be integers, let's consider them to be integers.

Hash Functions

$h(k) = k \pmod{m}$

What are good choices for $m$?
- $m \ne 2^p$
- $m \ne 2^p \pm c$ (close to a power of 2)
- $m \ne 10^p$ (especially with decimal keys)
- $m$ = prime is a good choice.

Note: If your $U$ is well known, you could try to experimentally optimize $m$. This is called the division method since $k \pmod{m} = k - m \lfloor k/m \rfloor$.

Multiplication Method

$h(k) = \lfloor m \, (kA \bmod 1) \rfloor$

- $A$ is a real number, $0 < A < 1$
- $m$ is usually an integer power of two ($2^p$)
- $kA \bmod 1 = kA - \lfloor kA \rfloor$

Example: $m = 2^p$.

$$k \times \lfloor A \, 2^w \rfloor = 2^w r_1 + r_0$$
$$2^p \, (2^w r_1 + r_0) = 2^{w+p} r_1 + r_0 \, 2^p$$

The $p$ most significant bits of $r_0$ are $h(k)$: multiplying by $2^p$ shifts $r_0$ by $p$ bits (creates an integer out of the $p$ m.s.b.s of $r_0$).

[Diagram: the $w$-bit key $k$ times $\lfloor A \, 2^w \rfloor$ gives the words $r_1$ and $r_0$; the top $p$ bits of $r_0$ are marked.]

What choices of $A$ are best? A: $A$ should be irrational. Which irrationals are the most irrational? A:

$$A = a + \cfrac{1}{b + \cfrac{1}{c + \cfrac{1}{c + \cdots}}}$$

with repeating continued fractions (solutions to quadratic equations).

Universal Hashing

With a fixed hash function there are keys that will hash poorly. We previously solved bad worst-case behavior by randomizing into average cases: choose your hash function randomly! This is called Universal Hashing. To do this we must construct a family of hash functions to choose from. $H$ is such a family, with each $h \in H$ mapping $h: U \to \{0, \dots, m-1\}$, if for each pair $x \ne y \in U$ the number of $h \in H$ such that $h(x) = h(y)$ is $|H|/m$. This means that a randomly chosen $h \in H$ will give $h(x) = h(y)$ (a collision) with probability $1/m$. This means that on the average (with regard to the functions in $H$) we get SUH.

Theorem: Let $h \in H$. We hash $n$ keys into a table of size $m$, $n \le m$. Then the expected number of collisions for a key $x$ is less than one.

Proof: Let $c_{yz} = 1$ if $h(y) = h(z)$, and $0$ otherwise. Then $E[c_{yz}] = 1/m$ (because $h$ was chosen randomly). Let $c_x$ be the total number of collisions with $x$ in $T$ of size $m$ with $n$ keys. Then

$$E[c_x] = \sum_{y \in T,\; y \ne x} E[c_{xy}] = (n-1)\,\frac{1}{m},$$

and $(n-1)(1/m) < 1$ since $n \le m$.

How can we design a universal class $H$?

Example: $|T| = m$, $m$ prime. Write $x = \langle x_0, x_1, x_2, \dots, x_r \rangle$ as bytes, with the maximum byte value less than $m$. Take $\langle a_0, a_1, a_2, \dots, a_r \rangle$ with each $a_i$ randomly chosen from $\{0, 1, 2, \dots, m-1\}$, and

$$h_a(x) = \sum_{i=0}^{r} a_i x_i \pmod{m}.$$

$H = \bigcup_a \{h_a\}$ has $m^{r+1}$ members.

Theorem: The class $H$ is a universal class.

Proof: Consider $x \ne y$; we can assume $x_0 \ne y_0$. With $\{a_1, \dots, a_r\}$ given, the collision condition $h(x) = h(y)$, written down for $a_0$, is

$$a_0 (x_0 - y_0) \equiv -\sum_{i=1}^{r} a_i (x_i - y_i) \pmod{m},$$

and it has only one $a_0$ that solves it. Since $m$ is prime:

$$a_0 \equiv -\sum_{i=1}^{r} a_i (x_i - y_i) \, (x_0 - y_0)^{-1} \pmod{m}.$$

This means $a_0$ can be found to cause a collision each time. There are thus $m^r$ different collisions here, one for each of the $m^r$ choices of $\langle a_1, a_2, a_3, \dots, a_r \rangle$. Since there are $m^{r+1}$ vectors $\langle a_0, a_1, a_2, \dots, a_r \rangle$, $x$ and $y$ collide with probability $m^r / m^{r+1} = 1/m$ $\Rightarrow$ $H$ is universal.

An aside on modular inversion

$a^{-1} \pmod{m}$ is the integer that solves $a \, a^{-1} \equiv 1 \pmod{m}$. For $a$ to have an inverse $\pmod{m}$ it must be that $\gcd(a, m) = 1$, i.e. $a$ and $m$ have no common factor. One computes $\gcd(a, m)$ via the Euclidean algorithm (we will analyze this later this term). A variant called the Extended Euclidean Algorithm, given $a$ and $m$, produces $\gcd(a, m) = xa + ym$. If $\gcd(a, m) = 1$ then $x = a^{-1}$!

Method 2: If $m$ is prime, then $a^{m-1} \equiv 1 \pmod{m}$ for any $a$ not divisible by $m$. (This fact is the basis for probabilistic primality testing.) Thus $a \, a^{m-2} \equiv 1 \pmod{m}$ and $a^{-1} \equiv a^{m-2} \pmod{m}$.

If modular multiplication is $\Theta(1)$, then what is the cost of modular exponentiation? Writing exponents in binary, with s = square and s-m = square-multiply:

- $a^3 = a^{11_2} = (a^2)\,a$ : (s-m)
- $a^4 = a^{100_2} = (a^2)^2$ : s s
- $a^5 = a^{101_2} = (a^2)^2 a$ : s (s-m)
- $a^{22} = a^{10110_2} = (((a^2)^2 a)^2 a)^2$

So starting from the next-to-most-significant bit and working towards the least significant bit of the exponent: when you come to a '0', square; when you come to a '1', square-multiply.

Why does this work? Induction: $a^{1_2} = a$, $a^{10_2} = a^2$, $a^{11_2} = a^3$. Assume $a^p$ is correct. Then $q = 2p + 1$ is reached by square-multiply, since $a^{2p+1} = (a^p)^2 a$, and $q = 2p$ is reached by square, since $a^{2p} = (a^p)^2$.

Cost: $\Theta(\lg p)$ operations. Note: this is the "giant-step" algorithm.
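The pieces of the extracted slide text above (the multiplication method, the universal class of dot-product hashes, modular inversion, and square-and-multiply exponentiation) can be tied together in a short Python sketch. This is an illustration under stated assumptions, not code from the slides: all function names are hypothetical, the word size `w = 32` and the constant `s = 2654435769` (that is, floor(2^32 · (√5 − 1)/2), standing in for ⌊A·2^w⌋) are assumed choices, and `m = 257` is simply a prime larger than any byte value.

```python
def multiplication_hash(k: int, p: int, w: int = 32, s: int = 2654435769) -> int:
    """Multiplication method with m = 2**p, using integer arithmetic only.

    s stands in for floor(A * 2**w); the default is an assumed choice.
    """
    r0 = (k * s) % (1 << w)   # low w bits of the product k * s
    return r0 >> (w - p)      # the p most significant bits of r0


def mod_pow(a: int, e: int, m: int) -> int:
    """Square-and-multiply for e >= 0, scanning exponent bits from the top."""
    a %= m
    if e == 0:
        return 1 % m
    result = a                        # accounts for the leading 1 bit
    for bit in bin(e)[3:]:            # skip '0b' and the leading 1 bit
        result = (result * result) % m    # always square
        if bit == "1":
            result = (result * a) % m     # multiply on a 1 bit
    return result


def mod_inverse_fermat(a: int, m: int) -> int:
    """a**(m-2) mod m; valid only for prime m and a not a multiple of m."""
    return mod_pow(a, m - 2, m)


def mod_inverse_euclid(a: int, m: int) -> int:
    """Extended Euclid: find x with x*a + y*m = gcd(a, m) = 1."""
    old_r, r = a % m, m
    old_x, x = 1, 0
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
    if old_r != 1:
        raise ValueError("a has no inverse modulo m")
    return old_x % m


def universal_hash(a_vec, x_vec, m: int) -> int:
    """h_a(x) = sum of a_i * x_i, reduced mod m."""
    return sum(ai * xi for ai, xi in zip(a_vec, x_vec)) % m


def colliding_a0(a_rest, x, y, m: int) -> int:
    """The single a_0 that makes x and y collide, given a_1..a_r and x_0 != y_0."""
    s = sum(ai * (xi - yi) for ai, xi, yi in zip(a_rest, x[1:], y[1:]))
    return (-s * mod_inverse_fermat(x[0] - y[0], m)) % m


if __name__ == "__main__":
    m = 257                       # assumed prime table size, larger than any byte
    x, y = [10, 20, 30], [11, 20, 31]
    a_rest = [5, 7]
    a0 = colliding_a0(a_rest, x, y, m)
    # Count how many of the m possible a_0 values cause a collision.
    hits = [c for c in range(m)
            if universal_hash([c] + a_rest, x, m) == universal_hash([c] + a_rest, y, m)]
    print(a0, hits)
```

For fixed a_1..a_r, the exhaustive loop finds exactly one colliding a_0 out of m, which is the counting step behind the 1/m collision probability. In everyday Python the built-in three-argument `pow(a, e, m)` does the same job as `mod_pow`; the explicit loop is kept only to mirror the square / square-multiply rule.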
Markdown
**Hash Tables** - 4 of 5
Readable Markdown | null
Shard | 19 (laksa)
Root Hash | 2399862591257072619
Unparsed URL | edu,fsu!cs,www,/~burmeste/slideshow/chapter12/12-3.html s443