codeslower.com Savor Your Code.

A Zip Implementation in C#

Over on his B# blog, Bart De Smet recently showcased the new Zip method that will be added to LINQ in C# 4.0. He starts by giving an implementation of Zip built out of existing operators and asks for alternatives. This post is about my version.

For those who aren't familiar with Zip, it takes two lists and "zippers" them together. In Haskell, the Prelude zip takes two lists and turns them into a list of pairs:

  zip :: [a] -> [b] -> [(a, b)]

If the lists are different lengths, then the remainder of the longer list is discarded. Anyways, here is my implementation using C# 3.0:

public static IEnumerable<TResult> Zip<T1, T2, TResult>(this IEnumerable<T1> s1, 
  IEnumerable<T2> s2, Func<T1, T2, TResult> func)
{
  if(s1.Count() > s2.Count()) 
    s1 = s1.Take(s2.Count());

  return s1.SelectMany((__, idxS1) => s2.Skip(idxS1).Take(1),
    (itemS1, itemS2) => func(itemS1, itemS2));
}

This is a bit more general than the Haskell version, in that the func argument will be given an element from each list and will return some result. It is not limited to just making pairs. For some quick examples, take these two lists:

List<int> first = (new int[] { 1, 2, 3 }).ToList();
List<int> second = (new int[] { 10, 11, 12, 13 }).ToList();

They can be added together like so:

first.Zip(second, (a, b) => a + b)

Or turned into pair-looking strings:

first.Zip(second, (a, b) => "(" + a + ", " + b + ")")

Or even turned into a sequence of two-element lists:

first.Zip(second, (a, b) => (new int[] {a, b}).ToList())

Bailing out is supported by checking the length of each list. The shorter list gets to "drive" the iteration. If s1 is longer, we just take s2.Count() number of elements from it and then iterate. If not, we just use s1 as it is either shorter than or the same length as s2.

The zip itself is accomplished with SelectMany. SelectMany gives each element in s1, with its index, to a "collection selector" function:

(__, idxS1) => s2.Skip(idxS1).Take(1)

Our function use Skip and Take on s2 to get the appropriate element from the second sequence (Reader Question: Why didn't we use ElementAt?). Note I also "named" the first parameter to the function "__" -- something I picked up from Haskell when I don't care about a parameter.

SelectMany then invokes the "result selector" function on each element in the original sequence and each element obtained from the "collection selector". We then apply func to accomplish the "zipping" of the two elements:

(itemS1, itemS2) => func(itemS1, itemS2)

SelectMany really does the work of zipping here, but we've just made it easier to use.

For comparison, here is Bart's example:

public static IEnumerable<TResult> Zip<TFirst, TSecond, TResult>(
       this IEnumerable<TFirst> first, 
       IEnumerable<TSecond> second, 
       Func<TFirst, TSecond, TResult> func)
{
    return first.Select((x, i) => new { X = x, I = i }).
      Join(second.Select((x, i) => new { X = x, I = i }), 
           o => o.I, 
           i => i.I, 
           (o, i) => func(o.X, i.X));
} 

He's taken a hybrid functional and relational DB approach. It's functional because he builds an anonymous type for each element in each sequence. That object pairs the element with its index ("Select((x, i) => new { X = x, I = i })", etc.). It's relational because he finds corresponding elements by joining the two sequences on the indices. He then applies func to the joined elements, giving the same results as my implementation above.

The code for this implementation, along with a few other examples not shown, can be found on GitHub:

http://github.com/m4dc4p/zipcsharp/tree/master

You can download all files using the "Download" button found there or use git to clone them yourself.

Comments, feedback, and patches welcome!

Category: None

Please login to comment.

2 Comments

  1. Cool by Func N. Stein (2008-12-05)

    Nice trick.

  2. Re: Article by Fred (2008-12-09)

    "This is a bit more general than the Haskell version, in that the func argument will be given an element from each list and will return some result."

    It is much like zipWith in fact: Prelude> zipWith (x y -> (x, y)) [1,2,3] [4,5,6] [(1,4),(2,5),(3,6)] Prelude> zip [1,2,3] [4,5,6] [(1,4),(2,5),(3,6)] Prelude> zipWith (+) [1,2,3] [4,5,6] [5,7,9]