Gieno  Collection in C#

Here is how I used LINQ to learn about what collections really are.

Collection initializers
With LINQ, the Language INtegrated Query framework, we're enabling a more expression-oriented style of programming. For instance, it should be possible to create and initialize an object within one expression. For collections, initialization typically amounts to adding an initial set of elements. Hence collection initializers in C# 3.0 look like this:

new MyNames { "Gieno Miao", "Ninine Wang", "Charlie Calvert" }
The meaning of this new syntax is simply to create an instance of MyNames using its no-arg constructor (constructor arguments can be supplied if necessary) and call its Add method with each of the strings.
So what types do we allow collection initializers on? Easy: collection types. What are those? Obvious: types that implement ICollection<T>. This is a nice and easy design: ICollection<T> ensures that you have an Add method, so obviously that is the one that gets called for each element in the collection initializer. It is strongly typed, too: the initializer can contain only elements of the appropriate element type. In the above new expression, MyNames would be a class that implements ICollection<string>, and everything works smoothly from there.
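As a minimal sketch of that equivalence: MyNames is not defined in the text, so the class below is a hypothetical stand-in that derives from Collection<string> (which implements ICollection<string>). The InitializerDemo helper is likewise only for illustration.

```csharp
using System.Collections.ObjectModel;

// Hypothetical stand-in for the MyNames class mentioned above.
// Collection<string> implements ICollection<string>, so a public Add(string) exists.
public class MyNames : Collection<string> { }

public static class InitializerDemo
{
    // Built in a single expression with a collection initializer.
    public static MyNames WithInitializer()
    {
        return new MyNames { "Gieno Miao", "Ninine Wang", "Charlie Calvert" };
    }

    // The explicit form the initializer stands for:
    // no-arg constructor, then one Add call per element.
    public static MyNames WithAddCalls()
    {
        var names = new MyNames();
        names.Add("Gieno Miao");
        names.Add("Ninine Wang");
        names.Add("Charlie Calvert");
        return names;
    }
}
```

Both methods should produce collections with the same three elements in the same order.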

LINQ to LINQ
It turns out that hardly anybody actually implements ICollection<T>. Well, nobody is a strong word. But we did an extensive study of our own framework classes, and found only a few that did. How? Using LINQ of course. The following query does the trick:
from name in assemblyNames
  select Assembly.LoadWithPartialName(name) into a
  from c in a.GetTypes()
  where c.IsPublic &&
     c.GetConstructors().Any(m => m.IsPublic) &&
     GetInterfaceTypes(c).Contains(typeof(ICollection<>))
  select c.FullName;
Let’s go through this query a little bit and see what it does. For each name in a list of assemblyNames that we pre-baked for the purpose, load up the corresponding assembly:
from name in assemblyNames
  select Assembly.LoadWithPartialName(name)
One at a time, put the reflection objects representing these assemblies into a, and for each assembly a run through the types c defined in there:
from c in a.GetTypes()

Filter through, keeping each type only if it
      a) IsPublic
      b) has Any constructor that IsPublic
      c) implements ICollection<T> for some T:

where c.IsPublic &&
     c.GetConstructors().Any(m => m.IsPublic) &&
     GetInterfaceTypes(c).Contains(typeof(ICollection<>))

For those that pass this test, select out their full name:

select c.FullName;

Nothing to it, really.
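For readers who want to try something similar today, here is a rough sketch rather than the original code. The GetInterfaceTypes helper is not shown in the text, so this version assumes it can be approximated with Type.GetInterfaces plus a check on the generic type definition. It also takes already-loaded Assembly objects instead of calling LoadWithPartialName, which is now marked obsolete. The names CollectionSurvey, ImplementsGenericCollection and FindCollections are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public static class CollectionSurvey
{
    // Assumed replacement for the GetInterfaceTypes(c).Contains(...) test:
    // true if the type implements ICollection<T> for some T.
    public static bool ImplementsGenericCollection(Type type)
    {
        return type.GetInterfaces().Any(i =>
            i.IsGenericType &&
            i.GetGenericTypeDefinition() == typeof(ICollection<>));
    }

    // Same shape as the query in the text: public types with a public
    // constructor that implement ICollection<T>, reported by full name.
    public static IEnumerable<string> FindCollections(IEnumerable<Assembly> assemblies)
    {
        return from a in assemblies
               from c in a.GetTypes()
               where c.IsPublic &&
                     c.GetConstructors().Any(m => m.IsPublic) &&
                     ImplementsGenericCollection(c)
               select c.FullName;
    }
}
```

A call such as CollectionSurvey.FindCollections(new[] { typeof(List<>).Assembly }) surveys the assembly that defines List<T>. The counts will differ from the ones quoted in this post, since the framework has changed a lot since then.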

What is a collection?
What did we find then? Only 14 of our own (public) classes (with public constructors) implement ICollection<T>! Obviously there are a lot more collections in the framework, so it was clear that we needed some other way of telling whether something is a collection class. LINQ to the rescue once more: With modified versions of the query it was easy to establish that among our public classes with public constructors there are 189 that implement IEnumerable and have a public Add method. Not every public Add method belongs to a collection, though, because the name “Add” can mean two fundamentally different things:
      a) Insert the argument into a collection, or
      b) Return the arithmetic sum of the argument and the receiver.
People are actually very good at (directly or indirectly) implementing the nongeneric IEnumerable interface when writing collection classes, so that turns out to be a pretty reliable indicator of whether an Add method is the first or the second kind. Thus for our purposes the operational answer to the headline question becomes:
      A collection is a type that implements IEnumerable and has a public Add method.
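To make that definition concrete, here is a small sketch. IsCollection is a hypothetical reflection check written for this example, and Basket and Money are invented types that illustrate the two kinds of Add: Basket qualifies as a collection without implementing ICollection<T>, while Money has an Add that means arithmetic sum.

```csharp
using System;
using System.Collections;
using System.Linq;

public static class CollectionTest
{
    // The operational definition: implements IEnumerable and has a public Add method.
    public static bool IsCollection(Type type)
    {
        return typeof(IEnumerable).IsAssignableFrom(type) &&
               type.GetMethods().Any(m => m.Name == "Add");
    }
}

// Hypothetical type: enumerable, has Add, but does not implement ICollection<T>.
public class Basket : IEnumerable
{
    private readonly ArrayList items = new ArrayList();

    public int Count { get { return items.Count; } }
    public void Add(object item) { items.Add(item); }
    public IEnumerator GetEnumerator() { return items.GetEnumerator(); }
}

// Hypothetical type: Add returns the sum of the argument and the receiver.
public class Money
{
    public readonly decimal Amount;

    public Money(decimal amount) { Amount = amount; }
    public Money Add(Money other) { return new Money(Amount + other.Amount); }
}
```

With these definitions, new Basket { 1, "two" } compiles as a collection initializer, while Money is rejected by the IEnumerable half of the test.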

Which Add to call?
We ain’t done yet, though. Further LINQ queries over the 189 collection types identified above show:
      28 collection types have more than one Add method
      30 collection types have no Add method with just one argument
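The follow-up queries are not shown in the post, but checks along these lines could have produced such counts. This is only a guess at their shape; AddSurvey and its method names are invented for the example, and both predicates are meant to be applied to types that already passed the collection test.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public static class AddSurvey
{
    // All public methods named Add, including inherited ones.
    private static IEnumerable<MethodInfo> AddMethods(Type type)
    {
        return type.GetMethods().Where(m => m.Name == "Add");
    }

    // More than one public Add method.
    public static bool HasSeveralAddMethods(Type type)
    {
        return AddMethods(type).Count() > 1;
    }

    // No public Add method that takes exactly one argument.
    public static bool HasNoSingleArgumentAdd(Type type)
    {
        return !AddMethods(type).Any(m => m.GetParameters().Length == 1);
    }
}
```

Dictionary<string,string> is a familiar example of the second group: its only public Add takes a key and a value, because the single-argument ICollection<KeyValuePair<string,string>>.Add is implemented explicitly and so is not a public member of the class.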

So, given that our collection initializers are supposed to call “the” Add method, which one should they call? It seems that there will be some value in collection initializers allowing you to:
      a) choose which overload to call
      b) call Add methods with more than one argument
Our resolution to this is to refine our understanding of collection initializers a little bit. The list you provide is not a “list of elements to add”, but a “list of sets of arguments to Add methods”. If an entry in the list consists of multiple arguments to an Add method, these are enclosed in { curly braces }. This is actually immensely useful. For example, it allows you to Add key/value pairs to a dictionary, something we have had a number of requests for as a separate feature.
The initializer list does not have to be homogeneous; we do separate overload resolution against Add methods for each entry in the list.
So given a collection class

public class Plurals : IDictionary<string,string> {
  public void Add(string singular, string plural); // implements IDictionary<string,string>.Add
  public void Add(string singular); // appends an “s” to the singular form
  public void Add(KeyValuePair<string,string> pair); // implements ICollection<KeyValuePair<string,string>>.Add
  …
}

We can write the following collection initializer:

Plurals myPlurals = new Plurals { "collection", { "query", "queries" }, new KeyValuePair<string,string>("child", "children") };

which would make use of all the different Add methods on our collection class.
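Here is a compilable sketch of the same idea. To keep it short, this Plurals is a simplified stand-in: instead of implementing all of IDictionary<string,string> as in the declaration above, it only implements IEnumerable<KeyValuePair<string,string>> and wraps a Dictionary. The indexer and the PluralsDemo helper are additions for the example.

```csharp
using System.Collections;
using System.Collections.Generic;

// Simplified stand-in for the Plurals class above (assumption: being
// enumerable and having the three Add overloads is all the initializer needs).
public class Plurals : IEnumerable<KeyValuePair<string, string>>
{
    private readonly Dictionary<string, string> map = new Dictionary<string, string>();

    public void Add(string singular, string plural) { map.Add(singular, plural); }

    // Appends an "s" to the singular form.
    public void Add(string singular) { map.Add(singular, singular + "s"); }

    public void Add(KeyValuePair<string, string> pair) { map.Add(pair.Key, pair.Value); }

    // Read-only lookup, added here so the result can be inspected.
    public string this[string singular] { get { return map[singular]; } }

    public IEnumerator<KeyValuePair<string, string>> GetEnumerator() { return map.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class PluralsDemo
{
    // Each entry is resolved against the Add overloads separately.
    public static Plurals Build()
    {
        return new Plurals
        {
            "collection",                                          // Add(string)
            { "query", "queries" },                                // Add(string, string)
            new KeyValuePair<string, string>("child", "children")  // Add(KeyValuePair<string,string>)
        };
    }
}
```

After PluralsDemo.Build(), looking up "collection" should give "collections", "query" should give "queries", and "child" should give "children".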

Is this right?
The resulting language design is a “pattern based” approach. It relies on users using a particular name for their methods in a way that is not checked by the compiler when they write it. If they go and change the name of Add to AddPair in one assembly, the compiler won’t complain about that, but instead about a collection initializer sitting somewhere else suddenly missing an overload to call.
Here I think it is instructive to look at our history. We already have pattern-based syntax in C#: the foreach pattern. Though not everybody realizes it, you can actually write a class that does not implement IEnumerable and have foreach work over it, as long as it contains a GetEnumerator method. What happens though is that people overwhelmingly choose to have the compiler help them keep it right by implementing the IEnumerable interface. In the same way I fully expect people to recognize the additional benefit of implementing ICollection<T> in the future: not only can your collection be initialized, but the compiler checks it too.
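A quick sketch of the foreach pattern, for anyone who has not seen it. Countdown is an invented type that deliberately does not implement IEnumerable; the public GetEnumerator method alone is enough for foreach. ForeachDemo is just a helper for the example.

```csharp
using System.Collections.Generic;

// Hypothetical type: no IEnumerable, only a public GetEnumerator method.
public class Countdown
{
    private readonly int start;

    public Countdown(int start) { this.start = start; }

    // Yields start, start - 1, ..., 1.
    public IEnumerator<int> GetEnumerator()
    {
        for (int i = start; i > 0; i--)
            yield return i;
    }
}

public static class ForeachDemo
{
    // foreach compiles against the GetEnumerator pattern, not the interface.
    public static List<int> Collect(int start)
    {
        var result = new List<int>();
        foreach (int n in new Countdown(start))
            result.Add(n);
        return result;
    }
}
```

Note that Countdown would not pass the collection test from earlier in the post, since it neither implements IEnumerable nor has an Add method.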
And as always, happy coding.

 

posted on 2009-09-17 13:22 Gieno