chd

package module
v0.0.0-...-4c9dd65 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jan 13, 2026 License: MIT Imports: 15 Imported by: 1

README

CHD Minimal Perfect Hash

Package chd implements the compress, hash, and displace (CHD) minimal perfect hash algorithm described in Hash, displace, and compress by Botelho et al. It provides a map builder that manages adding of items and map creation.

Installation

go get github.com/robskie/chd

Example

To create a map, you need to first use a map builder which is where you will add your items. After you have added all your items, you'll need to call Build from the map builder to create a map. If you have a large number of items, like more than a hundred thousand, building the map would take a while (from a few seconds to several minutes) to finish.

import "github.com/robskie/chd"

// Sample Item structure
type Item struct {
  Key   string
  Value int
}

items := []Item{
  {"ab", 0},
  {"cd", 1},
  {"ef", 2},
  {"gh", 3},
  {"ij", 4},
  {"kl", 5},
}

// Create a builder and add keys to it
builder := NewBuilder(nil)
for _, item := range items {
  builder.Add([]byte(item.Key))
}

// Build the map
m, _ := builder.Build()

// Rearrange items according to its map index
items = append(items, make([]Item, m.Cap()-len(items))...)
for i := 0; i < len(items); {
  idx := m.Get([]byte(items[i].key))

  if i != idx && len(items[i].key) > 0 {
    items[i], items[idx] = items[idx], items[i]
    continue
  }

  i++
}

// Do something useful with the map

You can also serialize a map and deserialize it later by using Map.Read and Map.Write methods. Like the Builder.Build method, you need to pass a succinct array that implements CompactArray when deserializing. This should be the same as the one used in building the map.

// Serialize the map
w, _ := os.Create("mymap.dat")
m.Write(w)

// Afterwards, you can deserialize it
r, _ := os.Open("mymap.dat")
nm := chd.NewMap()
nm.Read(r)

// Do something useful with the map

API Reference

Godoc documentation can be found here.

Benchmarks

A Core i5 running at 2.3GHz is used for these benchmarks. You can see here that Builder.Build's running time is directly proportional to the number of keys while the Map.Get's execution time is dependent on the speed of the CompactArray used to create the map.

You can run these benchmarks on your machine by typing this command go test github.com/robskie/chd -bench=.* in terminal.

BenchmarkBuild10KKeys-4           30       46166731 ns/op
BenchmarkBuild100KKeys-4           2      672838604 ns/op
BenchmarkBuild1MKeys-4             1    13144765689 ns/op
BenchmarkMapGet100KKeys-4   10000000            133 ns/op

Documentation

Overview

Package chd implements the compress, hash, and displace (CHD) minimal perfect hash algorithm. It provides a map builder that manages adding of items and map creation.

See http://cmph.sourceforge.net/papers/esa09.pdf for more details.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func SetCompactArray

func SetCompactArray(a CompactArray)

SetCompactArray sets the compressed array used to store hash indices. It is important that the compact array used when building and reading a map is exactly the same.

Types

type BuildOptions

type BuildOptions struct {
	// LoadFactor sets the load
	// factor. Lower values results
	// in faster build times. Default
	// value is 1.0
	//
	// If ForceBuild is enabled the
	// actual load factor may differ
	// significantly from the set value.
	LoadFactor float64

	// BucketSize sets the average
	// number of keys per bucket.
	// Default value is 5.
	BucketSize int

	// ForceBuild indicates that the
	// Builder.Build method will always
	// succeed. This is done by decreasing
	// the load factor every time it fails.
	// Default value is true.
	ForceBuild bool

	Trials int
}

BuildOptions specifies the options for building a map.

func NewBuildOptions

func NewBuildOptions() *BuildOptions

NewBuildOptions creates build options with default values.

type Builder

type Builder struct {
	// contains filtered or unexported fields
}

Builder manages adding of items and map creation.

func NewBuilder

func NewBuilder(opts *BuildOptions) *Builder

NewBuilder returns a new map builder given the build options. If opts is nil, the default values are used.

func (*Builder) Add

func (b *Builder) Add(key, value []byte)

Add adds a given key to the builder.

func (*Builder) Build

func (b *Builder) Build() (m *Map, err error)

Build creates a map.

type CompactArray

type CompactArray interface {
	Add(value int)
	Get(index int) int

	Len() int
	Size() int
}

CompactArray represents a compressed integer array.

type Map

type Map struct {
	// contains filtered or unexported fields
}

Map represents a map that uses CHD minimal perfect hash algorithm.

func NewMap

func NewMap() *Map

NewMap returns an empty map. Call Map.Read to populate it.

func (*Map) Get

func (m *Map) Get(key []byte) []byte

func (*Map) GetRandomKey

func (m *Map) GetRandomKey() []byte

Get a random entry from the hash table

func (*Map) GetRandomValue

func (m *Map) GetRandomValue() []byte

Get a random entry from the hash table

func (*Map) Len

func (m *Map) Len() int

Len returns the total number of keys.

func (*Map) Read

func (m *Map) Read(p []byte) (n int, err error)

func (*Map) Size

func (m *Map) Size() int

Size returns the size in bytes.

func (*Map) Values

func (m *Map) Values(yield func([]byte) bool) bool

func (*Map) WriteTo

func (m *Map) WriteTo(w io.Writer) (n int64, err error)

Serialize to given Writer

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL