Transforming the CLR to numerical applications, and the nullspace that results.

Wednesday, July 30, 2008

A Money Type for the CLR

Treatment of numbers on the CLR lead me to an interesting lack in the Base Class Libraries... there is no money type.

I've done a bit of reading and research and produced what I hope is a work which addresses not only the issues which have already been cataloged with this particular type pattern, but also some nuances of implementing it on the CLR.

The article is on CodeProject, and the code base is hosted on CodePlex.

Comment and criticism welcome.

Enjoy!

Wednesday, July 2, 2008

Generate AssemblyInfo.cpp for C++/CLI

I manage a number of C++/CLI projects (which, for the most part, is a decent language extension, contrary to the expected detractors' views). For my C# projects, I enjoy being able to create the AssemblyInfo.cs file on the fly from properties I store in my SVN repository. It allows me a sort of "poor man's product management" where I can manage product name, description and major.minor version in one place, versioned along with the code.

This was easy using NAnt. I've recently been moving to MSBuild and have been sharing the same sort of view as my former colleague Travis Illig: NAnt is way more flexible and easy out of the box. On the other hand, as Scott points out in that post, much of it is a matter of learning the "MSBuild way" where Tasks and Targets are much more independent as well as adding some custom tasks to replace the missing ones from NAnt.

Enter the very nice collection called MSBuild Community Tasks. It filled in for the AssemblyInfo task I loved in NAnt. However, I got no such love from it for C++/CLI. MSBuild Tasks are quite easy to create, however (easier, even, than NAnt tasks, since fewer attributes are needed). I looked over the source for the custom AssemblyInfo task, and found it to be very easy to modify. Taking advantage of the CppCodeProvider which is new for .Net 2.0, I added C++/CLI support to it, added some tests, and contributed the patch back to the project. Paul Welter reviewed and applied it astonishingly quickly. Something I could learn from for SharpMap (before you complain, however, notice I said I included tests and a patch file ;) )...

So, if you're looking for an AssemblyInfo.cpp generator, or a number of other useful custom tasks, look no further than the well-tended MSBuild Community Tasks project.

Tuesday, June 24, 2008

Source Diff: Build SSCLI2.0 (Rotor) in VS 2008

For the Rotor fans out there who don't want to install VS 2005 on their new machines, I followed Jeremy Kuhne's excellent step-by-step instructions to produce a source tree diff which will get you up and going fast.

http://code.msdn.microsoft.com/BuildRotorInVS2008

Thursday, May 8, 2008

Salvia - a drug to help with understanding projective geometry?

I've been working hard to get a good grasp on projective geometry - reading as much as I can on books.google.com as well as purchasing what appears to be a key exposition for beginners - The Geometry of Incidence by Harold L. Dorwart.

However, it is quite a bit of work to get the brain to accept projective space. Understanding linear projection of vectors via the dot product was hard... projective geometry is a new level of hard.

There seems to be hope, though... a drug called Salvia which is becoming quite the buzz (no pun intended).

From a Slate article:

"Generally, smoked salvia effects come on quickly, peak for 5-20 minutes, and then begin to subside," the FAQ notes. Users report visions; feelings of fright; loss of physical coordination; uncontrollable laughter; confusion; feelings of being underground, or underwater, or flying, or floating; experiences of "non-Euclidean" spaces; and more, according to Erowid.

Experiences of non-Euclidian spaces? Like, perhaps "projective spaces"? Just what I've been searching for! Need to check if it is legal in my state...

Saturday, April 5, 2008

Intel announces next-generation SSE: AVX

I got an email from Intel yesterday, announcing what amounts to 256-bit SSE instructions. They call it AVX for Advanced Vector eXtensions. I presume this going to eventually rely on the forthcoming technology currently code-named Larrabee, which seeks to push the CPU and GPU closer together and addressable with everyday tools, unlike having to write DX shaders, or use some special C++-like language (e.g. Brook or CUDA). Doing a little sleuthing turns up some hack's report on the spring IDF which mentions that AVX will be a post-Nehalem (Sandy Bridge) CPU feature. However, I suspect we'll see more of a CPU-GPGPU blend once Nehalem is out.

The only thing I can really get my hands on right now is a marketing site: http://softwareprojects.intel.com/avx/, which has some interesting PDFs about how you can use these extensions, but no engineering samples to give away. Why not?

AVX should make graphics more fun, since we can pack 4 doubles into a register and do a matrix multiply on it. Since 4 doubles makes a 3D coordinate in projective space, highly accurate spatial processing should get a lot faster.

On a related note, I hope to get a Penryn soon, and will be able to finally use the DDOT SSE4.1 instruction on some vertex data. Projections should get a lot faster... we'll see!

Tuesday, March 4, 2008

MS Releases the Singularity (Managed OS) Research Dev Kit!

I was just daydreaming about what it would be like to write managed drivers over this past weekend...

What a treat!

http://www.codeplex.com/singularity

Tuesday, February 12, 2008

Using SSE2 to sort numbers

I created an article on CodeProject which describes the SSE2 Bubble Sort I came up with. I hope that SSE2 will be a big help in computing some of the line intersections in NetTopologySuite. We'll have to see, however, since SSE2 math is different than x87 math (64 bit vs. 80 bit). By using a fixed or integer precision model, this problem should be avoided.

Monday, January 21, 2008

Swapping byte encoding order of multi-byte numeric types in .Net (C# and VB.Net) to convert from little-endian to big-endian and back

I was looking to replace the mechanism of the swapByteOrder(Double) method in GeoAPI, since it used the BitConverter.GetBytes()/Array.Reverse()/BitConverter.ToDouble() approach. I had a feeling that this was horribly slow, given that the Array.Reverse() made me take a GC hit in a tight loop (which is how this function is often used).

Having just come off of reworking some NTS code which manipulated Double values using the bits in the number, I had gotten a much better feel of the IEEE 754 double-precision 64 bit layout. I could use BitConverter.DoubleToInt64Bits() without going pale. So, my first step was to do this - convert the Double to an Int64. I was going to use the approach I employed swapping the bytes of an Int32 (bit-shifting and or'ing) to handle Int64 values, when I had the idea that I'd check if anyone had a better idea of reversing the bytes of an Int64.

So, I was Live Searching1 for various methods of doing byte-order reversal in multi-byte numeric types, and came across this post: http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=2032515&SiteID=1. It surprised me that people were using pointers and Array.Reverse() over bit-shifting. Someone noted that the pointer version would be more highly performing, which was likely quite true, but that forces you to add the /unsafe switch to the C# compiler, which turns off verification checking and throws partial trust out the window.

So, I decided to test it.

The Methods

I found 4 methods to swap bytes in pure managed code, and created one function per method which, when given an Int64, returns that Int64 with the bytes order reversed. These might not be the best methods for all data, but it is a good sampling of what is out there.

The 4 methods and corresponding functions are:










MethodFunction
Bit ShiftingC#:
private static Int64 swapBitShift(Int64 value)
{
UInt64 uvalue = (UInt64)value;
UInt64 swapped = ((0x00000000000000FF) & (uvalue >> 56)
(0x000000000000FF00) & (uvalue >> 40)
(0x0000000000FF0000) & (uvalue >> 24)
(0x00000000FF000000) & (uvalue >> 8)
(0x000000FF00000000) & (uvalue << 8)
(0x0000FF0000000000) & (uvalue << 24)
(0x00FF000000000000) & (uvalue << 40)
(0xFF00000000000000) & (uvalue << 56));
return (Int64)swapped;
}
VB:
Function swapBitShift(ByVal value As Int64) As Int64
Dim uvalue As UInt64 = CULng(value)
Dim swapped As UInt64 = ((&HFF) And (uvalue >> 56) _
Or (&HFF00) And (uvalue >> 40) _
Or (&HFF0000) And (uvalue >> 24) _
Or (&HFF000000) And (uvalue >> 8) _
Or (&HFF00000000) And (uvalue << 8) _
Or (&HFF0000000000) And (uvalue << 24) _
Or (&HFF000000000000) And (uvalue << 40) _
Or (&HFF00000000000000) And (uvalue << 56))
Return CLng(swapped)
End Function
Pointer Byte Swapping
private static Int64 swapPointerLoop(Int64 value)
{
unsafe
{
Int64 newValue = 0;
Byte* from = (byte*)&value;
Byte* to = (byte*)&newValue;

Int32 len = sizeof(Int64);

for (int i = 0; i < len; i++)
{
to[len - i] = from[i];
}

return newValue;
}
}
IPAddress.

HostToNetworkOrder

C#:
private static Int64 swapIpAddress(Int64 value)
{
return System.Net.IPAddress.HostToNetworkOrder(value);
}
VB:
Function swapIpAddress(ByVal value As Int64) As Int64
Return System.Net.IPAddress.HostToNetworkOrder(value)
End Function
Array.ReverseC#:
private static Int64 swapArrayReverse(Int64 value)
{
Byte[] bytes = BitConverter.GetBytes(value);
Array.Reverse(bytes);
return BitConverter.ToInt64(bytes, 0);
}
VB:
Function swapArrayReverse(ByVal value As Int64) As Int64
Dim bytes As Byte() = BitConverter.GetBytes(value)
Array.Reverse(bytes)
Return BitConverter.ToInt64(bytes, 0)
End Function

I ran each test 20 times and averaged the data. I was using the test computer throughout which simulates a system experiencing taxed resources2.


The Data












1,000 Int64s graphed on a linear scale 1,000 Int64s graphed on a logarithmic scale
10,000 Int64s graphed on a linear scale 10,000 Int64s graphed on a logarithmic scale
100,000 Int64s graphed on a linear scale 100,000 Int64s graphed on a logarithmic scale

Some Conclusions


It looks like bit shifting wins out, although it's probably negligibly more efficient than IPAddress.HostToNetworkOrder. The performance of this method surprised me - not that I thought that it was implemented less efficiently, but that the method call would cost more than it does. I was further surprised to see that IPAddress.HostToNetworkOrder(Int64) makes 6 method calls per invocation, in order to delegate flipping the bytes to IPAddress.HostToNetworkOrder(Int32) twice which further delegates to IPAddress.HostToNetworkOrder(Int16) twice each. I'm not sure how much of this the JIT compiler inlines, as there is some difference between the x86 code which it generates for IPAddress.HostToNetworkOrder and the bit shifting implementation - but whatever the difference, it isn't much.


Another surprise was the pointer method. It was much slower than I thought. Perhaps it was the implementation (which, granted, is naive when we know we are looking at arrays of bytes). Perhaps it was some overhead which the C# compiler puts in when running with the usual debug settings of Visual Studio. Perhaps it was the permission asserts which the security portion of the runtime has to put on the stack when running unsafe code. If anything, since I've considered using unsafe code to speed up numeric operations, this exercise was helpful in showing me that there are some pitfalls in using it which should be investigated beforehand.


Finally, the Array.Reverse method was much worse in terms of performance that I thought it was going to be. It was a full order of magnitude worse. With the two best methods - bit shifting and IPAddress.HostToNetworkOrder so accessible, I can think of no reason to continue to use this method.


1) Yes, I'm giving it another shot, although so far, it's... not bad, but not great.


2) Ok, it doesn't simulate it exactly, but I didn't want to stop using the computer. The biggest impact will be memory usage, and make Array.Reverse() look worse than if I hadn't been using it. This is probably closer to what we want to see, since memory usage from other processes on a machine is rarely zero.

Wednesday, January 16, 2008

SharpMap v2.0, GeoAPI v2.0, Proj.Net and NTS v2.0 progress

The next release of SharpMap is taking longer than imagined, since it wasn't quite as easy to transform all those JTS javaisms to .Netisms, especially when trying to be considerate of the whole functional-construct infusion that is LINQ, IronPython and F#. There is quite a legacy in numerical computing looking back through the annals of Python as well as the functional language side of the house, so no doubt ideas (and code) will step over from all that work into what could be fairly decent geometry stack in .Net: SharpMap, GeoAPI, Proj.Net and NTS.

However, recent check-ins to the v2.0 branches of GeoAPI.Net and NTS have begun to compile again, and the basic functionality should be up and running. SharpMap v2.0 Beta 2 will be released once all tests are passing, code coverage is up to 60% or so.

Tuesday, January 15, 2008

Quick Intro

This is codekaizen. I work on SharpMap, GeoAPI.Net, NPack, and, in order to support SharpMap, a (temporary?) fork of Proj.Net.

I have had to learn a lot about linear algebra, numerical methods and their implementations, computational geometry and associated algorithms, spatial data structures and indexing, limits of machine representation of vector components and real numbers, and the architecture of "vector processors" like the SSE instruction set or GPUs to speed up, sometimes by an order of magnitude, the core computation primitives in reasoning and displaying spatial data.

With all this swirling around in my head, I need an outlet. So it is upon you, oh gentle reader, that I sling the clay of awareness and hope to mold it into the art of understanding and craft of proficiency. I'll admit that I don't really have a deep and abiding grasp of the math. I also don't have breadth in the application of the algorithms and data structures to a variety of situations. Nonetheless, I do have a firm grasp of the Microsoft CLR v2.0 and the underlying machine, and I need to explore how to apply the CLR to the problems of spatial reasoning, data visualization, and various numerical and heuristic methods to assist in finding patterns in data and helping to arrange and render the visualizations. As the graphics capability of commodity hardware becomes powerful enough to easily render any 2D and most 3D scenes, and processing becomes more distributed and abundant, real-time, everywhere-available applications which help us see and interpret our work and our lives more efficiently, more comprehensively and with greater insights. Computation is a natural extension of our minds, and we are compelled to greater heights of understanding - our .Net applications must, and will, accommodate. I hope to fill in the gaps between numerical and algorithmic theory and practice on the way.