Today my students got to present some of their work at the Maryland Geographic Information Committee Quarterly meeting. On Monday we head to the University of Maryland at College Park to present our work with the Computer Science Department.
If you recall, I outlined our objective for the summer here, and provided some updated information here. We’ve also posted the code on GitHub here. We’ve made good progress this week: we modified the code to use the GDAL libraries for reading the raster DEM, and we have things working both as a valid QGIS plug-in and as a command-line function. Stop by the GitHub site and have a look at the code, or try running it on your own data.
If you recall, my concern was that GIS processes are big on data volume, but light on calculations per data element. With that concern in the back of my mind, we decided to generate a hillshade as it appears to be more computationally expensive than slope.
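To make the difference in per-cell work concrete, here is a minimal sketch of both calculations using Horn's 3x3 finite-difference method, a widely documented formulation that may differ from what our kernels actually do. The names (`Window`, `horn_gradient`, `slope_deg`, `hillshade`) and the square-cell assumption are illustrative only.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>

constexpr double kPi = 3.14159265358979323846;

// 3x3 elevation window, row-major, top (north) row first:
//   z[0] z[1] z[2]
//   z[3] z[4] z[5]
//   z[6] z[7] z[8]
using Window = std::array<double, 9>;

struct Gradient { double dzdx, dzdy; };

// Horn's weighted finite differences; `cell` is the size of a square cell.
Gradient horn_gradient(const Window& z, double cell) {
    assert(cell > 0.0);
    double dzdx = ((z[2] + 2 * z[5] + z[8]) - (z[0] + 2 * z[3] + z[6])) / (8 * cell);
    double dzdy = ((z[6] + 2 * z[7] + z[8]) - (z[0] + 2 * z[1] + z[2])) / (8 * cell);
    return {dzdx, dzdy};
}

// Slope in degrees: one square root and one arctangent per cell.
double slope_deg(const Window& z, double cell) {
    Gradient g = horn_gradient(z, cell);
    return std::atan(std::sqrt(g.dzdx * g.dzdx + g.dzdy * g.dzdy)) * 180.0 / kPi;
}

// Hillshade in the range 0..255: the same gradient and slope as above,
// plus an aspect (atan2) and several sines and cosines per cell.
double hillshade(const Window& z, double cell, double azimuth_deg, double altitude_deg) {
    Gradient g = horn_gradient(z, cell);
    double zenith = (90.0 - altitude_deg) * kPi / 180.0;
    double azimuth = (450.0 - azimuth_deg) * kPi / 180.0;  // compass -> math angle
    double slope = std::atan(std::sqrt(g.dzdx * g.dzdx + g.dzdy * g.dzdy));
    double aspect = std::atan2(g.dzdy, -g.dzdx);
    double shade = std::cos(zenith) * std::cos(slope) +
                   std::sin(zenith) * std::sin(slope) * std::cos(azimuth - aspect);
    return 255.0 * std::max(0.0, shade);
}
```

Both functions read the same nine neighbors, so the data volume per cell is identical; hillshade simply does more trigonometry with it, which is the kind of extra arithmetic a GPU should absorb well.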
Well, some early results shed some light on what is going on. As you can see from Table 1, the CUDA version of our slope computation is around 1.5x faster than the serial version (9:00 versus 14:00 total). That is good, but nothing to write home about. You’ll notice that the CUDA Total Time is less than the sum of its parts. That is because we are not only using the GPU to parallelize the slope computation, but are also using multi-threading to read in data while we are simultaneously performing the computations and writing out the results. So this QGIS plug-in is doing massive GPU parallel processing and multi-core processing!
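The overlap of reading, computing, and writing can be sketched as a three-stage pipeline with one thread per stage. This is an illustration rather than the plug-in's actual code: `Channel`, `read_tile`, `process_tile`, and `run_pipeline` are hypothetical names, and the compute stage here is a trivial CPU loop standing in for a GPU kernel launch.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Minimal blocking queue; an empty optional marks end of stream.
template <typename T>
class Channel {
public:
    void push(std::optional<T> item) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(std::move(item));
        }
        cv_.notify_one();
    }
    std::optional<T> pop() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty(); });
        std::optional<T> item = std::move(q_.front());
        q_.pop();
        return item;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::optional<T>> q_;
};

using Tile = std::vector<float>;

// Hypothetical stand-in for reading one tile of the raster from disk.
Tile read_tile(int index, std::size_t cells) {
    return Tile(cells, static_cast<float>(index));
}

// Hypothetical stand-in for the per-tile computation (slope, hillshade, ...).
void process_tile(Tile& tile) {
    for (float& v : tile) v = v * 2.0f + 1.0f;
}

// Runs the three stages concurrently and returns the tiles in written order.
std::vector<Tile> run_pipeline(int n_tiles, std::size_t cells_per_tile) {
    assert(n_tiles >= 0);
    Channel<Tile> to_compute, to_write;
    std::vector<Tile> written;

    std::thread reader([&] {
        for (int i = 0; i < n_tiles; ++i) to_compute.push(read_tile(i, cells_per_tile));
        to_compute.push(std::nullopt);  // signal: no more input
    });
    std::thread worker([&] {
        while (auto tile = to_compute.pop()) {
            process_tile(*tile);
            to_write.push(std::move(*tile));
        }
        to_write.push(std::nullopt);  // signal: no more output
    });
    std::thread writer([&] {
        while (auto tile = to_write.pop()) written.push_back(std::move(*tile));
    });

    reader.join();
    worker.join();
    writer.join();
    return written;
}
```

With the stages overlapped like this, the total run time tends toward the slowest stage rather than the sum of all three. That matches Table 1, where the CUDA total of 9:00 sits just above the 8:15 input time. A production version would also cap the queue length so a fast reader cannot fill memory on a very large raster.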
Table 1. Slope computation on a 1.5GB GeoTiff file shows that the CUDA implementation is about 1.5 times faster than a serial implementation for slope.

Function       CUDA    Serial
Input          8:15    2:30
Computation    3:48    6:15
Output         5:40    5:15
TOTAL Time     9:00    14:00
Table 2 shows the results for the hillshade computation. As I expected, adding more computation per cell really leveraged the GPU: the CUDA version is almost 2x faster (9:00 versus 17:00 total). And notice that the CUDA computation time IS THE SAME for slope and hillshade, 3:48 in both tables, while the serial computation time rose from 6:15 to 9:00. That means that even though we threw more work at the video card (which has about 1,000 cores), it processed the data in the same amount of time. Obviously we are being really efficient with the card.
Table 2. Hillshade computation shows that the CUDA version is almost 2x faster than the serial version.

Function       CUDA    Serial
Input          8:30    2:30
Computation    3:48    9:00
Output         5:09    5:00
TOTAL Time     9:00    17:00
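The speedup figures quoted for the two tables are just ratios of the timings, and a small helper makes them easy to check. This is a sketch with hypothetical names (`to_seconds`, `speedup`). It assumes each timing is two base-60 fields separated by a colon; the ratios come out the same whether those fields are minutes:seconds or hours:minutes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Converts a timing such as "3:48" to a count of the smaller unit
// (seconds, if the string is minutes:seconds).
int to_seconds(const std::string& t) {
    std::size_t colon = t.find(':');
    assert(colon != std::string::npos);
    return std::stoi(t.substr(0, colon)) * 60 + std::stoi(t.substr(colon + 1));
}

// Speedup of the CUDA run over the serial run: serial time / CUDA time.
double speedup(const std::string& serial, const std::string& cuda) {
    return static_cast<double>(to_seconds(serial)) / to_seconds(cuda);
}
```

For the totals, `speedup("14:00", "9:00")` is about 1.56 for slope and `speedup("17:00", "9:00")` is about 1.89 for hillshade. Looking at the Computation rows alone, the ratios are about 1.64 for slope (`"6:15"` over `"3:48"`) and about 2.37 for hillshade (`"9:00"` over `"3:48"`).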
WHERE WE GO NEXT
In the last couple of hours the guys have figured out how to use the GDAL libraries for reading and writing, so I expect our Input time to drop dramatically. On this data set, I am hoping we can be 3x faster; we’ll see.
Also, I fed the guys a much larger file (a 12.5 GB GeoTIFF). We were able to run the hillshade in 50 minutes! Currently, we can’t do any comparisons, as QGIS keeps crashing and our serial C++ code needs some fixing. I’m going to try to run hillshade in ArcGIS later in the week, and I’ll let you know how it goes.