When should I use the HashSet<T> type?

I am exploring the HashSet<T> type, but I don't understand where it stands in collections.

Can one use it to replace a List<T>? I imagine the performance of a HashSet<T> to be better, but I couldn't see individual access to its elements.

Is it only for enumeration?

 

Answer 1

The important thing about HashSet<T> is right there in the name: it's a set. The only things you can do with a single set are to establish what its members are and to check whether an item is a member.

Asking if you can retrieve a single element (e.g. set[45]) is misunderstanding the concept of the set. There's no such thing as the 45th element of a set. Items in a set have no ordering. The sets {1, 2, 3} and {2, 3, 1} are identical in every respect because they have the same membership, and membership is all that matters.
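As a minimal sketch of the point above (assuming C# 9 or later for top-level statements; the variable names are only illustrative), two sets built in different orders compare as equal, and the question a set answers is membership:

```csharp
using System;
using System.Collections.Generic;

var a = new HashSet<int> { 1, 2, 3 };
var b = new HashSet<int> { 2, 3, 1 };

// Same membership, so the sets are equal regardless of insertion order.
bool sameMembers = a.SetEquals(b);

// Membership is the question a set answers.
bool hasTwo = a.Contains(2);
bool hasFortyFive = a.Contains(45);

// There is no indexer: an expression such as a[45] does not compile for HashSet<T>.
Console.WriteLine($"{sameMembers} {hasTwo} {hasFortyFive}"); // True True False
```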

It's somewhat dangerous to iterate over a HashSet<T> because doing so imposes an order on the items in the set. That order is not really a property of the set. You should not rely on it. If ordering of the items in a collection is important to you, that collection isn't a set.

Sets are really limited, and their members are unique. On the other hand, they're really fast.
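A small sketch of the uniqueness guarantee, again assuming C# 9 or later and using made-up values: Add reports whether the item was actually inserted, so a duplicate leaves the set unchanged.

```csharp
using System;
using System.Collections.Generic;

var seen = new HashSet<string>();

bool firstAdd = seen.Add("alpha");   // true: the item was not present yet
bool secondAdd = seen.Add("alpha");  // false: the duplicate is ignored

Console.WriteLine($"{firstAdd} {secondAdd} {seen.Count}"); // True False 1
```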

 

Comments:

I think it's more correct to say that the order of the items in the HashSet is not defined, so don't rely on the iterator's order. If you iterate the set because you are doing something against the items in the set, that is not dangerous unless you are relying on anything related to order. A SortedSet has all the properties of the HashSet plus order; however, SortedSet does not derive from HashSet. Put another way, a SortedSet is an ordered collection of distinct objects.
– Kit
Sep 15, 2016 at 21:38
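
To illustrate the comment, here is a hedged sketch (C# 9 or later assumed, values invented for the example) that contrasts the two types: SortedSet<T> enumerates in sorted order, while iterating a HashSet<T> is only safe for work that does not depend on order, such as summing.

```csharp
using System;
using System.Collections.Generic;

var unordered = new HashSet<int> { 3, 1, 2 };
var sorted = new SortedSet<int> { 3, 1, 2 };

// SortedSet<T> enumerates in ascending order according to its comparer.
string sortedText = string.Join(", ", sorted);

// HashSet<T> enumeration order is not defined, so only do order-independent work.
int sum = 0;
foreach (int n in unordered)
{
    sum += n;
}

Console.WriteLine($"{sortedText} | sum = {sum}"); // 1, 2, 3 | sum = 6
```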

 

Answer 2

Here's a real example of where I use a HashSet<string>:

Part of my syntax highlighter for UnrealScript files is a new feature that highlights Doxygen-style comments. I need to be able to tell if an @ or \ command is valid to determine whether to show it in gray (valid) or red (invalid). I have a HashSet<string> of all the valid commands, so whenever I hit a @xxx token in the lexer, I use validCommands.Contains(tokenText) as my O(1) validity check. I really don't care about anything except existence of the command in the set of valid commands. Let's look at the alternatives I faced:

  • Dictionary<string, ?>: What type do I use for the value? The value is meaningless since I'm just going to use ContainsKey. Note: Before .NET Framework 3.5 this was the only choice for O(1) lookups - HashSet<T> was added in 3.5 and extended to implement ISet<T> in 4.0.
  • List<string>: If I keep the list sorted, I can use BinarySearch, which is O(log n) (didn't see this fact mentioned above). However, since my list of valid commands is a fixed list that never changes, this will never be more appropriate than simply...
  • string[]: Again, Array.BinarySearch gives O(log n) performance. If the list is short, this could be the best performing option. It always has less space overhead than HashSet, Dictionary, or List. Even with BinarySearch, it's not faster for large sets, but for small sets it'd be worth experimenting. Mine has several hundred items though, so I passed on this.
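
The options above can be compared side by side in a rough sketch. This is not the highlighter's actual code: the four command names are a tiny illustrative subset, the ordinal comparer is an assumption, and only the name validCommands comes from the answer. C# 9 or later is assumed for top-level statements.

```csharp
using System;
using System.Collections.Generic;

// Illustrative subset only; the answer's real list has several hundred entries.
string[] commands = { "brief", "param", "return", "see" };

// HashSet<string>: O(1) average membership test, no placeholder value needed.
var validCommands = new HashSet<string>(commands, StringComparer.Ordinal);
bool inSet = validCommands.Contains("param");

// Dictionary<string, bool>: also O(1), but the value is a meaningless placeholder.
var asDictionary = new Dictionary<string, bool>(StringComparer.Ordinal);
foreach (string c in commands)
{
    asDictionary[c] = true;
}
bool inDictionary = asDictionary.ContainsKey("param");

// Sorted string[]: O(log n) lookup; a non-negative index means the item was found.
string[] sortedCommands = (string[])commands.Clone();
Array.Sort(sortedCommands, StringComparer.Ordinal);
bool inArray = Array.BinarySearch(sortedCommands, "param", StringComparer.Ordinal) >= 0;

// An unknown command fails the check.
bool unknown = validCommands.Contains("bogus");

Console.WriteLine($"{inSet} {inDictionary} {inArray} {unknown}"); // True True True False
```

All three give the same answer; the difference is lookup cost, memory overhead, and whether the type expresses the intent, which is worth measuring for your own data size.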

 

Author: Chuck Lu