Earlier this evening, I got Brahma working, running an identity query.
var result = from value in data
             select value;
Admittedly, the query doesn’t really seem to do much; it just copies the input to the output. But let’s take a moment to dig a little deeper and see how it all comes together.
Since we’re going to run this on the GPU, we first create our graphics API-specific provider that sets up the basics. Since this is DirectX, we create a GPUComputationProvider.
_provider = new GPUComputationProvider(_form.Handle, 256, 256);
As you can see, we need a form handle to create the provider. There’s no way around that, since we’re using a graphics API. The 256, 256 there denotes the initial size of the provider. This is going to go away, since each time we perform a computation the back buffer is automagically resized.
We now need data to perform computations on. The earlier version of Brahma used images as data. However, since the new Brahma is purely GPGPU, I have tried hard to remove all traces of GPU or graphics API related baggage. To this end, DataParallelArray2D (DataParallelArray1D is coming real soon) allows us to specify a delegate that fills in its values (or to provide a single value that is used to fill the whole array). Here, I initialize one such array with 1.0 on the diagonal and 0.3 everywhere else.
DataParallelArray2D<float> data =
    new DataParallelArray2D<float>(_provider, 256, 256,
        (x, y) => (x == y) ? 1.0f : 0.3f);
DataParallelArray2D is generic, and I’ve used float as its type argument. At this time, float is the only supported generic argument for DataParallelArray2D. I plan to add Vector2, Vector3 and Vector4 in the future.
A question for readers, here: Should the data types I support be API-specific, or should they be .NET data types? Supporting .NET data types would be good, since we’d have complete abstraction and switching APIs would only mean changing the provider. On the other hand, it would involve a data transformation for each read or write, and this might prove expensive. Do let me know if you have any thoughts on the subject.
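To make the fill delegate a little more concrete, here is a CPU-only sketch of what such an initializer presumably boils down to: the delegate is called once per (x, y) coordinate and its result is stored at that position. FillSketch and Fill are hypothetical names used only for illustration; they are not part of Brahma, and Brahma's real implementation may differ.

```csharp
using System;

// Hypothetical CPU-only stand-in for illustration; not part of Brahma.
public static class FillSketch
{
    // Builds a width x height array by calling the fill delegate once per element.
    public static float[,] Fill(int width, int height, Func<int, int, float> fill)
    {
        var values = new float[width, height];
        for (int x = 0; x < width; x++)
            for (int y = 0; y < height; y++)
                values[x, y] = fill(x, y);
        return values;
    }

    public static void Main()
    {
        // Same initializer as in the post: 1.0 on the diagonal, 0.3 elsewhere.
        float[,] data = Fill(4, 4, (x, y) => (x == y) ? 1.0f : 0.3f);
        Console.WriteLine(data[2, 2]); // a diagonal element
        Console.WriteLine(data[0, 3]); // an off-diagonal element
    }
}
```

The single-value overload mentioned above would be the same loop with a delegate that ignores its arguments.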
And finally, when all of this is done, we run our little query. In the background, this:
- Creates a texture and fills it with the data you passed in.
- Processes the expression tree and generates the HLSL shown after this list.
- Runs that HLSL on our diagonal array, which produces the exact same output.
- Since I still haven’t completed IQueryProvider.Execute, we access the values using DataParallelArray2D’s indexer. The moment we do this, Brahma fills the array with the values computed on the GPU.
sampler sampler0;
float4 main(float2 texCoord : TEXCOORD0) : COLOR
{
    float4 value = tex2D(sampler0, texCoord);
    return value;
}
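The "processes the expression tree" step can be illustrated without Brahma. The C# compiler rewrites from value in data select value as data.Select(value => value), and when that lambda is captured as an expression tree, its body is simply the lambda's own parameter. The sketch below uses only System.Linq.Expressions to detect that shape and emit a return statement like the one in the shader. IdentitySketch and ToShaderBody are hypothetical names, and this is only an assumption about the general approach; Brahma's actual translator is not shown in this post.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical illustration; not Brahma's real translator.
public static class IdentitySketch
{
    // Returns an HLSL-like statement for a selector. Only the identity case is handled.
    public static string ToShaderBody(Expression<Func<float, float>> selector)
    {
        // For value => value, the body is the very same ParameterExpression
        // object that appears in the lambda's parameter list.
        if (selector.Body is ParameterExpression p && p == selector.Parameters[0])
            return "return " + p.Name + ";";

        throw new NotSupportedException("Only the identity selector is handled in this sketch.");
    }

    public static void Main()
    {
        // Prints: return value;
        Console.WriteLine(ToShaderBody(value => value));
    }
}
```

A real translator would walk arbitrary expression nodes (arithmetic, method calls and so on) instead of rejecting everything except the identity case.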
And a simple test later, we can see that the query did indeed copy the contents exactly.
// We still can't enumerate through the results, so let's use the indexer
DataParallelArray2D<float> resultValues =
    result as DataParallelArray2D<float>; // We need to cast this
for (int x = 0; x < 256; x++)
    for (int y = 0; y < 256; y++)
        Assert.AreEqual((x == y) ? 1.0f : 0.3f, resultValues[x, y]);
All of this may sound like a horribly complicated way of copying something, but when we need to perform more complicated transformations on the data, executing them on the GPU will yield far better performance than executing them on the CPU. Stay tuned.
Note: This version of Brahma has no releases. Use the Subversion repository to access the current source.
Perhaps the .NET provider could be used with lazy evaluation to avoid unnecessary switching between the APIs?
If by lazy evaluation you mean deferring execution of the shaders until the results are asked for, this is something I do have in mind. However, I don’t think it is possible to minimize moving the data to and from the GPU. This is completely controlled by the programmer.
Or did you mean something else entirely?