Boost C++ Libraries Home Libraries People FAQ More

PrevUpHomeNext

Semi-Intrusive unordered associative containers: unordered_set, unordered_multiset 介入式無序關聯容器:unordered_set, unordered_multiset

unordered_set and unordered_multiset performance notes  unordered_set 和 unordered_multiset 的性能說明
unordered_set and unordered_multiset hooks  unordered_set 和 unordered_multiset 鉤子
unordered_set and unordered_multiset containers  unordered_set 和 unordered_multiset 容器
Example 例子
Custom bucket traits 定制化桶 traits

Boost.Intrusive also offers hashed containers that can be very useful to implement fast-lookup containers. These containers (unordered_set and unordered_multiset) are semi-intrusive containers: they need additional memory apart from the hook stored in the value_type. This additional memory must be passed in the constructor of the container.
Boost.Intrusive 也提供了散列容器,對於實現快速查找的容器非常有用。這些容器(unordered_setunordered_multiset) 是半介入式容器:它們需要除了內部鉤子以外的內存。

Unlike C++ TR1 unordered associative containers (which are also hashed containers), the contents of these semi-intrusive containers are not rehashed to maintain a load factor: that would require memory management and intrusive containers don't implement any memory management at all. However, the user can request an explicit rehashing passing a new bucket array. This also offers an additional guarantee over TR1 unordered associative containers: iterators are not invalidated when inserting an element in the container.
和 C++ TR1 無序關聯容器(它們也是散列容器)不同,這些半介入式容器中的內容是不能重散列以維護負載因子的:這樣會需要內存管理,而介入式容器是不實現任何內存管理 的。不過,用戶可以傳入一個新的桶數組來請求一次明確的重散列。這種方式也比 TR1 無序關聯容器多了一個保證:在容器中插入元素時不會導致迭代器失效

As with TR1 unordered associative containers, rehashing invalidates iterators, changes ordering between elements and changes which buckets elements appear in, but does not invalidate pointers or references to elements.
和 TR1 無序關聯容器一樣,重散列會使得迭代器失效,改變元素間的順序,改變元素所在的桶,但不會導致元素的指針或引用失效。

Apart from expected hash and equality function objects, Boost.Intrusive unordered associative containers' constructors take an argument specifying an auxiliary bucket vector to be used by the container.
除了所需的散列和相等性這兩個函數對像以外,Boost.Intrusive 無序關聯容器的構造函數還要接受一個參數,指定一個被容器使用的輔助桶 vector。

The size overhead for a hashed container is moderate: 1 pointer per value plus a bucket array per container. The size of an element of the bucket array is usually one pointer. To obtain a good performance hashed container, the bucket length is usually the same as the number of elements that the container contains, so a well-balanced hashed container (bucket_count() is equal to size() ) will have an equivalent overhead of two pointers per element.
散列容器的空間開銷為中等:每個值1個指針,再加上每個容器一個桶數組。桶數組中每個元素的大小通常為1個指針。要提供一個優越性能的散列容器,桶的長度通常要和容器中的元素數量相等,因此一個正常的散列容器(bucket_count() 等於 size())的空間開銷相當於每個元素2個指針。

An empty, non constant-time size unordered_set or unordered_multiset has also the size of bucket_count() pointers.
一個空的、不帶常量時間 size 的 unordered_setunordered_multiset 也具有 bucket_count() 個指針的大小。

Insertions, erasures, and searches, have amortized constant-time complexity in hashed containers. However, some worst-case guarantees are linear. See unordered_set or unordered_multiset for complexity guarantees of each operation.
散列容器的插入、刪除和查找操作具有分期常量時間複雜度。但是,有些最壞情況的保證為線性複雜度。各個操作的複雜度保證請見 unordered_setunordered_multiset

Be careful with non constant-time size hashed containers: some operations, like empty(), have linear complexity, unlike other Boost.Intrusive containers.
留意不帶常量時間 size 的散列容器:有些操作,如 empty(),具有線性複雜度,這與其它 Boost.Intrusive 容器不同。

unordered_set and unordered_multiset share the same hooks. This is an advantage, because the same user type can be inserted first in a unordered_multiset and after that in unordered_set without changing the definition of the user class.
unordered_setunordered_multiset 共享相同的鉤子。這是一個優點,因為同一個用戶類型可以先被插入到 unordered_multiset 中,然後再插入到 unordered_set 中,而無需修改用戶類的定義。

template <class ...Options>
class unordered_set_base_hook;

template <class ...Options>
class unordered_set_member_hook;

unordered_set_base_hook and unordered_set_member_hook receive the same options explained in the section How to use Boost.Intrusive:
unordered_set_base_hookunordered_set_member_hook 接受在 如何使用 Boost.Intrusive 一節中說明的各種選項: 

  • tag<class Tag> (for base hooks only): This argument serves as a tag, so you can derive from more than one base hook. Default: tag<default_tag>.
    tag<class Tag> (只用於基類鉤子):該參數作為一個標記,你可以派生自多個 slist 鉤子。缺省值:tag<default_tag>.
  • link_mode<link_mode_type LinkMode>: The linking policy. Default: link_mode<safe_link>.
    link_mode<link_mode_type LinkMode>: 鏈接策略。缺省值:link_mode<safe_link>.
  • void_pointer<class VoidPointer>: The pointer type to be used internally in the hook and propagated to the container. Default: void_pointer<void*>.
    void_pointer<class VoidPointer>: 在鉤子內部使用並被傳遞給容器的指針類型。缺省值:void_pointer<void*>.

Apart from them, these hooks offer additional options:
除此之外,這些鉤子還提供了其它選項:

  • store_hash<bool Enabled>: This option reserves additional space in the hook to store the hash value of the object once it's introduced in the container. When this option is used, the unordered container will store the calculated hash value in the hook and rehashing operations won't need to recalculate the hash of the value. This option will improve the perfomance of unordered containers when rehashing is frequent or hashing the value is a slow operation. Default: store_hash<false>.
    store_hash<bool Enabled>: 這一選項在鉤子中保留額外的空間,在對像被加入到容器中時保存它的散列值。使用該選項時,無序容器將已經計算過的散列值保存在鉤子中,這樣在進行重散列時就無需重新計算散列值。如果重散列頻繁或散列值的計算較慢,則這一選項可以提高無序容器的性能。缺省值:store_hash<false>.
  • optimize_multikey<bool Enabled>: This option reserves additional space in the hook that will be used to group equal elements in unordered multisets, improving significantly the performance when many equal values are inserted in these containers. Default: optimize_multikey<false>.
    optimize_multikey<bool Enabled>: 這一選項在鉤子中保留額外的空間,用於將無序 multiset 中的相等元素聚在一起,當向容器中插入大量相等元素時,該選項可以顯著改進性能。缺省值:optimize_multikey<false>

template<class T, class ...Options>
class unordered_set;

template<class T, class ...Options>
class unordered_multiset;

As mentioned, unordered containers need an auxiliary array to work. Boost.Intrusive unordered containers receive this auxiliary array packed in a type called bucket_traits (which can be also customized by a container option). All unordered containers receive a bucket_traits object in their constructors. The default bucket_traits class is initialized with a pointer to an array of buckets and its size:
如上所述,無序容器需要一個輔助的數組來工作。Boost.Intrusive 無序容器接受一個包裝在 bucket_traits (也可以由容器的選項來定制化)類型中的輔助數組。所有無序容器在它們的構造函數中接受一個 bucket_traits 對象。缺省的 bucket_traits 類以一個桶數組指針及其大小來初始化:

#include <boost/intrusive/unordered_set.hpp>

using namespace boost::intrusive;

struct MyClass : public unordered_set_base_hook<>
{};

typedef unordered_set<MyClass>::bucket_type     bucket_type;
typedef unordered_set<MyClass>::bucket_traits   bucket_traits;

int main()
{
   bucket_type buckets[100];
   unordered_set<MyClass> uset(bucket_traits(buckets, 100));
   return 0;
}

Each hashed container needs its own bucket traits, that is, its own bucket vector. Two hashed containers can't share the same bucket_type elements. The bucket array must be destroyed after the container using it is destroyed, otherwise, the result is undefined.
散列容器需要它自己的桶 traits,即是它自己的桶 vector。兩個散列容器不能共享同一個 bucket_type 元素。桶數組必須在使用它的容器被銷毀之後銷毀,否則,結果將是未定義的。

unordered_set and unordered_multiset receive the same options explained in the section How to use Boost.Intrusive:
unordered_setunordered_multiset 接受在 如何使用 Boost.Intrusive 一節中說明的選項:

  • base_hook<class Hook> / member_hook<class T, class Hook, Hook T::* PtrToMember> / value_traits<class ValueTraits>: To specify the hook type or value traits used to configure the container. (To learn about value traits go to the section Containers with custom ValueTraits.)
    base_hook<class Hook> / member_hook<class T, class Hook, Hook T::* PtrToMember> / value_traits<class ValueTraits>: 指定用於配置容器的鉤子類型或 value traits。(要學習有關 value traits 的知識,請見 帶定制化 ValueTraits 的容器)
  • constant_time_size<bool Enabled>: To activate the constant-time size() operation. Default: constant_time_size<true>
    constant_time_size<bool Enabled>: 激活常量時間的 size() 操作。缺省值:constant_time_size<true>
  • size_type<bool Enabled>: To specify the type that will be used to store the size of the container. Default: size_type<std::size_t>
    size_type<bool Enabled>: 指定用於保存容器大小的類型。缺省值:size_type<std::size_t>.

And they also can receive additional options:
它們還可以接受其它選項:

  • equal<class Equal>: Equality function for the objects to be inserted in containers. Default: equal< std::equal_to<T> >
    equal<class Equal>: 被插入到容器中的對象所用的相等性函數。缺省值:equal< std::equal_to<T> >
  • hash<class Hash>: Hash function to be used in the container. Default: hash< boost::hash<T> >
    hash<class Hash>: 容器所用的散列函數。缺省值:hash< boost::hash<T> >
  • bucket_traits<class BucketTraits>: A type that wraps the bucket vector to be used by the unordered container. Default: a type initialized by the address and size of a bucket array and stores both variables internally.
    bucket_traits<class BucketTraits>: 無序容器所用的某個包裝了桶 vector 的類型。缺省值:一個由桶數組的地址及大小進行初始化並在內部保存這兩個變量的類型。
  • power_2_buckets<bool Enabled>: The user guarantees that only bucket arrays with power of two length will be used. The container will then use masks instead of modulo operations to obtain the bucket number from the hash value. Masks are faster than modulo operations and for some applications modulo operations can impose a considerable overhead. In debug mode an assertion will be raised if the user provides a bucket length that is not power of two. Default: power_2_buckets<false>.
    power_2_buckets<bool Enabled>: 用戶保證只使用長度為2的冪數的桶數組。容器將使用掩碼操作代替模操作來從散列值獲得桶的數字。掩碼操作比模操作更快,對於某些應用來說,模操作會帶來顯著的開銷。在調試模式下,如果用戶提供的桶的長度不是2的冪數,將引起一個斷言。缺省值:power_2_buckets<false>.
  • cache_begin<bool Enabled>: Due to its internal structure, finding the first element of an unordered container (begin() operation) is amortized constant-time. It's possible to speed up begin() and other operations related to it (like clear()) if the container caches internally the position of the first element. This imposes the overhead of one pointer to the size of the container. Default: cache_begin<false>.
    cache_begin<bool Enabled>: 由於內部結構的原因,查找無序容器的第一個元素(begin() 操作)是分期常量時間複雜度的。如果容器在內部對第一個元素的位置進行緩存,那麼就可以提高 begin() 和其它相關操作(如 clear())的速度。這意味著容器的大小要增加1個指針的開銷。缺省值:cache_begin<false>.
  • compare_hash<bool Enabled>: Note: this option requires store_hash<true> option in the hook. When the comparison function is expensive, (e.g. strings with a long common predicate) sometimes (specially when the load factor is high or we have many equivalent elements in an unordered_multiset and no optimize_multikey<> is activatedin the hook) the equality function is a performance problem. Two equal values must have equal hashes, so comparing the hash values of two elements before using the comparison functor can speed up some implementations.
    compare_hash<bool Enabled>: 注意:本選項要求鉤子中使用 store_hash<true> 選項。如果比較函數開銷較大(如帶有常見謂詞的字符串),有時(尤其是負載因子較高或在 unordered_multiset 中有很多相等元素且鉤子中沒有激活 optimize_multikey<> 的時候)相等性函數會存在性能的問題。兩個相等的值必然具有相等的散列值,所以在使用比較函數對像之前先比較散列值可以加快某些實現的速度。
  • incremental<bool Enabled>: Activates incremental hashing (also known as Linear Hashing). This option implies power_2_buckets<true> and the container will require power of two buckets. For more information on incremental hashing, see Linear hash on Wikipedia Default: incremental<false>
    incremental<bool Enabled>: 激活遞增散列(又稱線性散列)。該選項意味著 power_2_buckets<true> 以及該容器將要求桶數量為2的冪數。有關遞增散列的更多信息,請見 Linear hash on Wikipedia。缺省值:incremental<false>

Now let's see a small example using both hooks and both containers:
現在我們來看一個使用這些鉤子和容器的小例子:

#include <boost/intrusive/unordered_set.hpp>
#include <vector>
#include <algorithm>
#include <boost/functional/hash.hpp>

using namespace boost::intrusive;

class MyClass : public unordered_set_base_hook<>
{               //This is a derivation hook
int int_; public: unordered_set_member_hook<> member_hook_; //This is a member hook
MyClass(int i) : int_(i) {} friend bool operator== (const MyClass &a, const MyClass &b) { return a.int_ == b.int_; } friend std::size_t hash_value(const MyClass &value) { return std::size_t(value.int_); } }; //Define an unordered_set that will store MyClass objects using the base hook
typedef unordered_set<MyClass> BaseSet; //Define an unordered_multiset that will store MyClass using the member hook
typedef member_hook<MyClass, unordered_set_member_hook<>, &MyClass::member_hook_> MemberOption; typedef unordered_multiset< MyClass, MemberOption> MemberMultiSet; int main() { typedef std::vector<MyClass>::iterator VectIt; typedef std::vector<MyClass>::reverse_iterator VectRit; //Create a vector with 100 different MyClass objects
std::vector<MyClass> values; for(int i = 0; i < 100; ++i) values.push_back(MyClass(i)); //Create a copy of the vector
std::vector<MyClass> values2(values); //Create a bucket array for base_set
BaseSet::bucket_type base_buckets[100]; //Create a bucket array for member_multi_set
MemberMultiSet::bucket_type member_buckets[200]; //Create unordered containers taking buckets as arguments
BaseSet base_set(BaseSet::bucket_traits(base_buckets, 100)); MemberMultiSet member_multi_set (MemberMultiSet::bucket_traits(member_buckets, 200)); //Now insert values's elements in the unordered_set
for(VectIt it(values.begin()), itend(values.end()); it != itend; ++it) base_set.insert(*it); //Now insert values's and values2's elements in the unordered_multiset
for(VectIt it(values.begin()), itend(values.end()), it2(values2.begin()),itend2(values2.end()) ; it != itend; ++it, ++it2){ member_multi_set.insert(*it); member_multi_set.insert(*it2); } //Now find every element
{ VectIt it(values.begin()), itend(values.end()); for(; it != itend; ++it){ //base_set should contain one element for each key
if(base_set.count(*it) != 1) return 1; //member_multi_set should contain two elements for each key
if(member_multi_set.count(*it) != 2) return 1; } } return 0; }

Instead of using the default bucket_traits class to store the bucket array, a user can define his own class to store the bucket array using the bucket_traits<> option. A user-defined bucket-traits must fulfill the following interface:
除了使用缺省的 bucket_traits 類來保存桶數組以外,用戶也可以使用 bucket_traits<> 選項來定義自己的類,以保存桶數組。用戶自定義的 bucket-traits 必須遵守以下接口:

class my_bucket_traits
{
   bucket_ptr        bucket_begin();
   const_bucket_ptr  bucket_begin() const;
   std::size_t bucket_count() const;
};

The following bucket traits just stores a pointer to the bucket array but the size is a compile-time constant. Note the use of the auxiliary unordered_bucket and unordered_bucket_ptr utilities to obtain the type of the bucket and its pointer before defining the unordered container:
以下的桶 traits 只保存一個桶數組的指針,而桶數組的大小是一個編譯期常數。注意,在定義無序容器之前,使用輔助的 unordered_bucketunordered_bucket_ptr 工具可以得到桶的類型及其指針:

#include <boost/intrusive/unordered_set.hpp>
#include <boost/functional/hash.hpp>
#include <vector>

using namespace boost::intrusive;

//A class to be inserted in an unordered_set 插入到 unordered_set 中的一個類
class MyClass : public unordered_set_base_hook<> { int int_; public: MyClass(int i = 0) : int_(i) {} friend bool operator==(const MyClass &l, const MyClass &r) { return l.int_ == r.int_; } friend std::size_t hash_value(const MyClass &v) { return boost::hash_value(v.int_); } }; //Define the base hook option 定義基類鉤子選項
typedef base_hook< unordered_set_base_hook<> > BaseHookOption; //Obtain the types of the bucket and the bucket pointer 獲得桶的類型和桶的指針
typedef unordered_bucket<BaseHookOption>::type BucketType; typedef unordered_bucket_ptr<BaseHookOption>::type BucketPtr; //The custom bucket traits. 定制化的桶 traits
class custom_bucket_traits { public: static const int NumBuckets = 100; custom_bucket_traits(BucketPtr buckets) : buckets_(buckets) {} //Functions to be implemented by custom bucket traits 由定制化的桶 traits 所實現的函數
BucketPtr bucket_begin() const { return buckets_; } std::size_t bucket_count() const { return NumBuckets;} private: BucketPtr buckets_; }; //Define the container using the custom bucket traits 使用定制化的桶 traits 定義容器
typedef unordered_set<MyClass, bucket_traits<custom_bucket_traits> > BucketTraitsUset; int main() { typedef std::vector<MyClass>::iterator VectIt; std::vector<MyClass> values; //Fill values 填充值
for(int i = 0; i < 100; ++i) values.push_back(MyClass(i)); //Now create the bucket array and the custom bucket traits object 現在創建桶數組和定制化的桶 traits 對像
BucketType buckets[custom_bucket_traits::NumBuckets]; custom_bucket_traits btraits(buckets); //Now create the unordered set 現在創建無序集合
BucketTraitsUset uset(btraits); //Insert the values in the unordered set 將值插入到無序集合中
for(VectIt it(values.begin()), itend(values.end()); it != itend; ++it) uset.insert(*it); return 0; }


PrevUpHomeNext