GeoEpistemology uh where’s that?

Google Arunachal Pradesh

Fig 1 – Google Arunachal Pradesh as represented to .in, not to be confused with the Arunachal Pradesh served to .cn

In the background of the internet lies an ongoing discussion of epistemology. It’s an important discussion, with links to crowdsourcing algorithms, big data, and even AI. Perhaps it’s a stretch to include maps, which after all mean to represent “exactitude in science,” or JTB, Justified True Belief. On the one hand we have the prescience of Jorge Luis Borges, concisely represented by his single-paragraph short story.

Del rigor en la ciencia

… En aquel Imperio, el Arte de la Cartografía logró tal Perfección que el mapa de una sola Provincia ocupaba toda una Ciudad, y el mapa del Imperio, toda una Provincia. Con el tiempo, esos Mapas Desmesurados no satisfacieron y los Colegios de Cartógrafos levantaron un Mapa del Imperio, que tenía el tamaño del Imperio y coincidía puntualmente con él. Menos Adictas al Estudio de la Cartografía, las Generaciones Siguientes entendieron que ese dilatado Mapa era Inútil y no sin Impiedad lo entregaron a las Inclemencias del Sol y de los Inviernos. En los desiertos del Oeste perduran despedazadas Ruinas del Mapa, habitadas por Animales y por Mendigos; en todo el País no hay otra reliquia de las Disciplinas Geográficas.

-Suárez Miranda: Viajes de varones prudentes,
Libro Cuarto, Cap. XLV, Lérida, 1658

translation by Andrew Hurley

On Exactitude in Science

…In that Empire, the Art of Cartography attained such Perfection that the map of a single Province occupied the entirety of a City, and the map of the Empire, the entirety of a Province. In time, those Unconscionable Maps no longer satisfied, and the Cartographers Guilds struck a Map of the Empire whose size was that of the Empire, and which coincided point for point with it. The following Generations, who were not so fond of the Study of Cartography as their Forebears had been, saw that that vast Map was Useless, and not without some Pitilessness was it, that they delivered it up to the Inclemencies of Sun and Winters. In the Deserts of the West, still today, there are Tattered Ruins of that Map, inhabited by Animals and Beggars; in all the Land there is no other Relic of the Disciplines of Geography.

-Suárez Miranda, Viajes de varones prudentes,
Libro IV, Cap. XLV, Lérida, 1658

As Jorge Luis Borges so aptly implies, the issue of epistemology swings between scientific exactitude and cultural fondness, an artistic nod to the unsettling observations of Thomas Kuhn’s paradigm shiftiness in The Structure of Scientific Revolutions.

Precession of Simulacra

On the other hand, Jean Baudrillard would prefer an inversion of Borges in his Simulacra and Simulation:

“The territory no longer precedes the map, nor does it survive it. It is nevertheless the map that precedes the territory—precession of simulacra—that engenders the territory”

In a less postmodern sense we can point to the recent spectacle of Nicaraguan sovereignty extending into Costa Rica, provoked by a preceding Google Maps error, as a very literal “precession of simulacrum.” See details in Wired.

We now have map border wars and a crafty Google expedient of representing Arunachal Pradesh according to client language. China sees one thing, India another, and all are happy. So maps are no more exempt from geopolitical machinations than Wikipedia. Of course the secular bias of Google invents an agnostic viewpoint of neither here nor there, in the process presuming a superior vantage and relegating “simplistic” nationalism to a subjected role of global ignorance. Not unexpectedly, global corporations wield power globally, and therefore their interests lie supranationally.

Perhaps in a Jean Baudrillard world the DPRK could disappear for ROK viewers and vice versa, resolving a particularly long-lived conflict.

Filter Bubbles

We are all more or less familiar with the filter bubble phenomenon. Your every wish is my command.

“The best books, he perceived, are those that tell you what you know already.”
George Orwell, 1984, p. 185

The consumer is king, and this holds true in search and advertising as well as in Aladdin’s tale. Search filters at the behest of advertising money work very well at fencing us into smaller and smaller bubbles of our own desire. The danger of self-referential input has a well-known name: narcissism. We see it at work in contextual map bubbles displaying only points of interest deemed relevant by past searches.

With Google Glass, self-referential virtual objects can literally mask any objectionable reality. Should a business desire to pop a filter bubble, only a bit more money is required. In the end, map POI algorithms dictate desire by limiting context. Are “personalized” maps a hint of the precession of simulacra or simply one more example of rampant technical narcissism?


In the political realm, elitists such as Cass Sunstein want to nudge us, a yearning of all mildly totalitarian states. Although cognitive infiltration will do in a pinch, “a boot stamping on a human face” is reserved as a last resort. How might the precession of simulacra assist the fulfillment of Orwellian dreams?

Naturally, political realities are less interested in our desires than their own. This is apparently a property of organizational ascendancy. Whether corporations or state agencies, at some critical mass organizations take on a life of their own. The organization eventually becomes predatory, preying for its survival on those it serves. Political information bubbles are less about individual desires than the survival of the state. To be blunt, “nudge” is a euphemism for good old propaganda.

propaganda map

Fig 2 - Propaganda Map - more of a shove than a nudge

The line from Sunstein to a Clinton, of either gender, is short. Hillary Clinton has long decried the chaotic democracy of page-ranked search algorithms. After noting that any and all ideas, even uncomfortable truths, can surface virally in a Drudge effect, Hillary would insist, “we are all going to have to rethink how we deal with the Internet.” At least she seems to have thought creatively about State Dept emails. Truth is more than a bit horrifying to oligarchs of all types, as revealed by the treatment of Edward Snowden, Julian Assange, and Barrett Brown.

Truth Vaults

Enter Google’s aspiration to Knowledge-Based Trust: Estimating the Trustworthiness of Web Sources. In other words, a “truth page ranking” to supplant the venerable but messily democratic “link page ranking.” Why, after all, leave discretion or critical thought to the unqualified masses? For the history-minded, this is rather reminiscent of the pre-Reformation exercise of Rome’s magisterium. We may soon see a Google Magisterium defining internet truth, albeit subject to FCC review.

“The net may be ‘neutral’ but the FCC is most certainly not.”

According to Google: “Nothing but the truth.” Who could object? Well, there seem to be some doubters among the hoi polloi. How, then, does this Google epistemology actually work? What exactly is Justified True Belief in Google’s Magisterium, and how much does it overlap with the politically powerful?

Leaving aside marginal Gettier cases, there are some pressing questions about the mechanics of KBT. In Google’s KBT basement sits a thing called the Knowledge Vault, alongside the Knowledge Graph.

“The fact extraction process we use is based on the Knowledge Vault (KV) project.”

“Knowledge Vault has pulled in 1.6 billion facts to date. Of these, 271 million are rated as “confident facts”, to which Google’s model ascribes a more than 90 per cent chance of being true. It does this by cross-referencing new facts with what it already knows.”

“Google’s Knowledge Graph is currently bigger than the Knowledge Vault, but it only includes manually integrated sources such as the CIA Factbook.”

“This is the most visionary thing,” says Suchanek. “The Knowledge Vault can model history and society.”

Per Jean Baudrillard, read “model” as a verb rather than a noun. Google (is it possible to do this unwittingly?) arrogates a means to condition the present, in order to model the past, to control our future, to paraphrase the Orwellian syllogism.

“Who controls the past controls the future. Who controls the present controls the past.”
George Orwell, 1984

Not to be left behind, MSNBC’s owner, Microsoft, harbors similar aspirations:

“LazyTruth developer Matt Stempeck, now the director of civic media at Microsoft New York, wants to develop software that exports the knowledge found in fact-checking services such as Snopes and PolitiFact so that everyone has easy access to them.”

And National Geographic too, all in for a new science: the Consensus of “Experts.”

“Everybody should be questioning,” says McNutt. “That’s a hallmark of a scientist. But then they should use the scientific method, or trust people using the scientific method, to decide which way they fall on those questions.”

Ah yes, the consensus of “Experts,” naturally leading to the JTB question: whose experts? The IPCC might do well to reflect on Copernicus with regard to the ancien régime and scientific consensus.

Snopes duo

Fig 3 - Snopes duo in the Truth Vault at Google and Microsoft? Does the cat break tie votes?

Google’s penchant for metrics and algorithmic “neutrality” neatly papers over the Mechanical Turk or two in the vault, so to speak.

Future of simulacra

In the pre-digital Soviet era, map propaganda was an expensive proposition. Interestingly, today Potemkin maps are an anachronistic cash cow with only marginal propaganda value. Tomorrow’s Potemkin maps, according to Microsoft, will be much more entertaining, but also a bit creepy if coupled to brain interfaces. Brain controls are inevitably a two-way street.

“Microsoft HoloLens understands your movements, vision, and voice, enabling you to interact with content and information in the most natural way possible.”

The only question is: who is interacting with content in the most unnatural way possible in the Truth Vault?

Will InfoCrafting at the brain interface be the next step for precession of simulacra?


Fig 5 – Leonie and $145 million worth of “InfoCrafting” COINTEL with paid trolls and sock puppet armies


Is our cultural fondness leaning toward globally agnostic maps of infinite plasticity, one world per person? Jean Baudrillard would likely presume the Google relativistic map is the order of the day, where the precession of simulacra induces a customized world generated in some kind of propagandistic nirvana, tailored for each individual.

But just perhaps, the subtle art of Jorge Luis Borges would speak to a future of less exactitude:

“still today, there are Tattered Ruins of that Map, inhabited by Animals and Beggars; in all the Land there is no other Relic of the Disciplines of Geography.”

I suppose to be human is to straddle exactitude and art, never sure whether to land on truth or on beauty. Either way, we do well to Beware of Truth Vaults!

WebGL with a little help from Babylon.js

Most modern browsers now support the HTML5 WebGL standard: Internet Explorer 11+, Firefox 4+, Google Chrome 9+, Opera 12+. One of the latest to the party is IE 11.


Fig 2 – html5 test site showing WebGL support for IE11

WebGL support means that GPU power is available to JavaScript developers in supporting browsers. GPU technology fuels the $46.5 billion “vicarious life” industry. Video gaming surpasses even Hollywood movie tickets in annual revenue, though this projection shows the revenue curve falling by 2019. Hard to say why the decline; is it possibly an economic side effect of too much vicarious living? The relative merits of passive versus active forms of “vicarious living” are debatable, but as long as technology chases these vast sums of money, GPU geometry pipeline performance will continue to improve year over year.

WebGL exposes immediate-mode graphics pipelines for fast 3D transforms, lighting, shading, animations, and other amazing stuff. GPU-induced endorphin bursts do have their social consequences. Apparently, Huxley’s futuristic vision has won out over Orwell’s, at least in internet culture.

“In short, Orwell feared that what we fear will ruin us. Huxley feared that our desire will ruin us.”

Neil Postman, Amusing Ourselves to Death

Aside from the Soma-like addictive qualities of game playing, game creation is actually a lot of work. Setting up WebGL scenes with objects, textures, shaders, transforms, and so on is not a trivial task, which is where Dave Catuhe’s Babylon.js framework comes in. Dave has been building 3D engines for a long time. In fact I played with some of Dave’s earlier efforts in ye olde Silverlight days of yore.

“I am a real fan of 3D development. Since I was 16, I spent all my spare time creating 3d engines with various technologies (DirectX, OpenGL, Silverlight 5, pure software, etc.). My happiness was complete when I discovered that Internet Explorer 11 has native support for WebGL. So I decided to write once again a new 3D engine but this time using WebGL and my beloved JavaScript.”

Dave Catuhe Eternal Coding

Dave’s efforts improve with each iteration, and Babylon.js is a wonderfully powerful yet simple-to-use JavaScript WebGL engine. The usefulness/complexity curve is a rising trend. To be sure, a full-fledged gaming environment is still a lot of work. With Babylon.js much of the heavy lifting falls to the art design side. From a mapping perspective I’m happy to forego the gaming, but still enjoy some impressive 3D map building with low effort.

In order to try out Babylon.js I went back to an old standby, NASA Earth Observation (NEO) data. NASA has kindly provided an OGC WMS server for their earth data. Brushing off some old code, I made use of Babylon.js to display NEO data on a rotating globe.

Babylon.js has innumerable samples and tutorials, which makes learning easy for those of us less inclined to read manuals. The playground is an easy way to experiment: Babylon playground

The Babylon.js engine is used to create a scene, which is then handed off to engine.runRenderLoop. From a mapping perspective, most of the interesting stuff happens in createScene.

Here is a very basic globe:

<!DOCTYPE html>
<html xmlns="">
<head>
    <title>Babylon.js Globe</title>
    <script src=""></script>
    <style>
        html, body {
            overflow: hidden;
            width: 100%;
            height: 100%;
            margin: 0;
            padding: 0;
        }
        #renderCanvas {
            width: 100%;
            height: 100%;
            touch-action: none;
        }
    </style>
</head>
<body>
    <canvas id="renderCanvas"></canvas>
    <script>
        var canvas = document.getElementById("renderCanvas");
        var engine = new BABYLON.Engine(canvas, true);

        var createScene = function () {
            var scene = new BABYLON.Scene(engine);

            // Light
            var light = new BABYLON.HemisphericLight("HemiLight", new BABYLON.Vector3(-2, 0, 0), scene);

            // Camera
            var camera = new BABYLON.ArcRotateCamera("Camera", -1.57, 1.0, 200, BABYLON.Vector3.Zero(), scene);
            camera.attachControl(canvas, true);

            // Creation of a sphere
            // (name of the sphere, segments, diameter, scene)
            var sphere = BABYLON.Mesh.CreateSphere("sphere", 100.0, 100.0, scene);
            sphere.position = new BABYLON.Vector3(0, 0, 0);
            sphere.rotation.x = Math.PI;

            // Add material to sphere
            var groundMaterial = new BABYLON.StandardMaterial("mat", scene);
            groundMaterial.diffuseTexture = new BABYLON.Texture('textures/earth2.jpg', scene);
            sphere.material = groundMaterial;

            // Animations - rotate earth
            var alpha = 0;
            scene.beforeRender = function () {
                sphere.rotation.y = alpha;
                alpha -= 0.01;
            };

            return scene;
        };

        var scene = createScene();

        // Register a render loop to repeatedly render the scene
        engine.runRenderLoop(function () {
            scene.render();
        });

        // Watch for browser/canvas resize events
        window.addEventListener("resize", function () {
            engine.resize();
        });
    </script>
</body>
</html>

Fig 3 – rotating Babylon.js globe

Add one line for a 3D effect using a normal (bump) map texture.

groundMaterial.bumpTexture = new BABYLON.Texture('textures/earthnormal2.jpg', scene);

Fig 4 – rotating Babylon.js globe with normal (bump) map texture

The textures applied to BABYLON.Mesh.CreateSphere required some transforms to map correctly.


Fig 5 – texture images require img.RotateFlip(RotateFlipType.Rotate90FlipY);

Without this image transform the resulting globe is more than a bit warped. It reminds me of a Pangaea timeline gone mad.


Fig 6 – globe with no texture image transform

Updating our globe texture skin requires a simple proxy that performs the img.RotateFlip after fetching the requested NEO WMS image:

        public Stream GetMapFlip(string wmsurl)
        {
            string message = "";
            try
            {
                HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create(new Uri(wmsurl));
                using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
                {
                    if (response.StatusDescription.Equals("OK"))
                    {
                        using (Image img = Image.FromStream(response.GetResponseStream()))
                        {
                            //rotate image 90 degrees, flip on Y axis
                            img.RotateFlip(RotateFlipType.Rotate90FlipY);
                            using (MemoryStream memoryStream = new MemoryStream())
                            {
                                img.Save(memoryStream, System.Drawing.Imaging.ImageFormat.Png);
                                WebOperationContext.Current.OutgoingResponse.ContentType = "image/png";
                                return new MemoryStream(memoryStream.ToArray());
                            }
                        }
                    }
                    else message = response.StatusDescription;
                }
            }
            catch (Exception e)
            {
                message = e.Message;
            }
            ASCIIEncoding encoding = new ASCIIEncoding();
            Byte[] errbytes = encoding.GetBytes("Err: " + message);
            return new MemoryStream(errbytes);
        }

With texture in hand, the globe can be updated, setting hasAlpha to true:

var overlayMaterial = new BABYLON.StandardMaterial("mat0", nasa.scene);
var nasaImageSrc = Constants.ServiceUrlOnline + "/GetMapFlip?url=" + nasa.image + "%26BGCOLOR=0xFFFFFF%26TRANSPARENT=TRUE%26SRS=EPSG:4326%26BBOX=-180.0,-90,180,90%26width=" + nasa.width + "%26height=" + nasa.height + "%26format=image/png%26Exceptions=text/xml";
overlayMaterial.diffuseTexture = new BABYLON.Texture(nasaImageSrc, nasa.scene);
overlayMaterial.bumpTexture = new BABYLON.Texture('textures/earthnormal2.jpg', nasa.scene);
overlayMaterial.diffuseTexture.hasAlpha = true;
nasa.sphere.material = overlayMaterial;

Setting hasAlpha to true lets a secondary earth texture show through the NEO overlay wherever data was not collected. For example, Bathymetry (GEBCO_BATHY) leaves transparent holes for the continental masses, making the earth texture underneath visible. Alpha sliders could also be added to stack several NEO layers, but that’s another project.
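The hand-escaped %26 entities in the GetMapFlip request above encode the ampersands so the entire WMS query rides inside the single url parameter handed to the proxy. A small helper can do that escaping programmatically; this is just a sketch (the function and parameter names are hypothetical, not part of the project code) using encodeURIComponent:

```javascript
// Build a proxied WMS GetMap request. The inner WMS query is URL-encoded
// wholesale so its '&' separators survive as part of the single 'url'
// parameter handed to the GetMapFlip proxy.
function buildProxyUrl(proxyBase, wmsBase, params) {
    var query = Object.keys(params)
        .map(function (k) { return k + "=" + params[k]; })
        .join("&");
    return proxyBase + "/GetMapFlip?url=" + encodeURIComponent(wmsBase + "?" + query);
}
```

The result carries no literal ampersands of its own, so the proxy receives the whole WMS request as one opaque string it can fetch, flip, and return.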


Fig 7 – alpha bathymetry texture over earth texture

Since a rotating globe can be annoying, it’s worthwhile adding a toggle switch for the rotation-weary. One simple method is to make use of a Babylon pick event:

        window.addEventListener("click", function (evt) {
            var pickResult = nasa.scene.pick(evt.clientX, evt.clientY);
            if (pickResult.hit && pickResult.pickedMesh.name != "skyBox") {
                if (nasa.rotationRate < 0.0) nasa.rotationRate = 0.0;
                else nasa.rotationRate = -0.005;
            }
        });

In this case any click ray that intersects the globe will toggle globe rotation on and off. Click picking is a kind of collision check for object intersection in the scene, which could be very handy for adding globe interaction. In addition to pickedMesh, pickResult gives a pickedPoint location, which could be reverse transformed to a latitude, longitude.
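As a sketch of that reverse transform, basic spherical trigonometry recovers latitude and longitude from the picked Cartesian point. This assumes a sphere of known radius centered at the origin and a hypothetical rotationY argument to back out the globe’s current spin; the texture flips applied earlier would need accounting for in a real implementation:

```javascript
// Convert a picked Cartesian point on the globe back to latitude/longitude
// in degrees. The sphere is assumed centered at the origin; the current
// rotation.y of the sphere is subtracted so longitude stays earth-fixed.
function pickedPointToLatLon(point, radius, rotationY) {
    var lat = Math.asin(point.y / radius) * 180 / Math.PI;
    var lon = (Math.atan2(point.z, point.x) - rotationY) * 180 / Math.PI;
    // normalize longitude into [-180, 180)
    lon = ((lon + 180) % 360 + 360) % 360 - 180;
    return { lat: lat, lon: lon };
}
```

Feeding pickResult.pickedPoint through something like this would let globe clicks drive lookups keyed by geographic coordinates.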

Starbox (no coffee involved) is a quick way to add a surrounding background in 3D. It’s really just a BABYLON.Mesh.CreateBox big enough to engulf the earth sphere, a very limited kind of cosmos. The stars are not astronomically accurate, just added for some mood setting.

Another handy Babylon feature is BABYLON.Mesh.CreateGroundFromHeightMap:

/* Name
 * Height map picture url
 * mesh Width
 * mesh Height
 * Number of subdivisions (increases the complexity of this mesh)
 * Minimum height: the lowest level of the mesh
 * Maximum height: the highest level of the mesh
 * scene
 * Updatable: whether this mesh can be updated dynamically in the future (Boolean)
 */
var height = BABYLON.Mesh.CreateGroundFromHeightMap("height", "textures/" + heightmap, 200, 100, 200, 0, 2, scene, false);

For example, using a grayscale elevation image as a HeightMap will add exaggerated elevation values to a ground map:


Fig 8 – elevation grayscale jpeg for use in BABYLON HeightMap


Fig 9 – HeightMap applied

The HeightMap can encode any value; for example, NEO monthly fires converted to grayscale will show fire density over the surface.
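The conversion itself is simple per-pixel luminance math; here is a minimal sketch (helper names hypothetical) using the standard Rec. 601 luma weights, a common default rather than anything NEO-specific:

```javascript
// Standard Rec. 601 luma weighting: perceived brightness of an RGB pixel, 0-255.
function luminance(r, g, b) {
    return Math.round(0.299 * r + 0.587 * g + 0.114 * b);
}

// Apply the luminance per pixel over a flat RGBA array (e.g. canvas ImageData.data),
// turning any NEO color image into a grayscale heightmap source in place.
function toGrayscale(pixels) {
    for (var i = 0; i < pixels.length; i += 4) {
        var l = luminance(pixels[i], pixels[i + 1], pixels[i + 2]);
        pixels[i] = pixels[i + 1] = pixels[i + 2] = l;
    }
    return pixels;
}
```

Run over a canvas-decoded NEO image, the resulting grayscale can be fed straight to CreateGroundFromHeightMap, with brighter pixels becoming taller spikes.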


Fig 10 – NEO monthly fires as heightmap

In this case a first-person-shooter (FPS) camera was substituted for the generic ArcRotateCamera so users can stalk around the earth looking at fire spikes.

“FreeCamera – This is a ‘first person shooter’ (FPS) type of camera where you control the camera with the mouse and the cursors keys.”

Lots of camera choices are listed here, including Oculus Rift, which promises some truly immersive map opportunities. I assume this note indicates Babylon is waiting on the retail release of Oculus to finalize a camera controller:

“The OculusCamera works closely with our Babylon.js OculusController class. More will be written about that, soon, and nearby.

Another Note: In newer versions of Babylon.js, the OculusOrientedCamera constructor is no longer available, nor is its .BuildOculusStereoCamera function. Stay tuned for more information.”

So it may be only a bit longer before “vicarious life” downhill skiing opportunities are added to FreshyMap.



Fig 11 - NEO Land Surface average night temperature

Borders and Big Data


Fig 1 – Big Data Analytics is a lens, the data is a side effect of new media

I was reflecting on borders recently, possibly because of reading Cormac McCarthy’s The Border Trilogy. Borders come up fairly often in mapping:

  • Geography – national political borders, administrative borders
  • Cartography – border line styles, areal demarcation
  • Web Maps – pixel borders bounding polygonal event handlers
  • GIS – edges between nodes defining faces
  • Spatial DBs – Dimensionally Extended nine-Intersection Model (DE-9IM) 0,1 1,0 1,1 1,2 2,1

However, this is not about map borders – digital or otherwise.

McCarthy is definitely old school, if not Faulknerian. All fans of Neal Stephenson are excused. The Border Trilogy, of course, is all about a geographic border, the Southwest US–Mexico border in particular. At other levels, McCarthy is rummaging about, surfacing all sorts of borders: cultural borders, language borders (half the dialogue is Spanish), class borders, time borders (coming of age, epochal endings), moral borders with their many crossings. The setting is the prewar 1930s–50s, a pre-technology era as we now know it, and only McCarthy’s mastery of evocative language connects us to these times now lost.

A random excerpt illustrates:

“Because the outer door was open the flame in the glass fluttered and twisted and the little light that it afforded waxed and waned and threatened to expire entirely. The three of them bent over the poor pallet where the boy lay looked like ritual assassins. Bastante, the doctor said Bueno. He held up his dripping hands. They were dyed a rusty brown. The iodine moved in the pan like marbling blood. He nodded to the woman. Ponga el resto en el agua, he said. . . . “

The Crossing, Chapter III, p.24

Technology Borders
There are other borders in our present preoccupation, for instance “technology” borders. We’ve all recently crossed a new media border and are still feeling our way in the dark, wondering where it may all lead. All we know for sure is that everything is changed. In some camps the euphoria is palpable, but vaguely disturbing. In others, change has only lately dawned on expiring regimes. Political realms are just now grappling with its meaning and consequence.

Big Data – Big Hopes
One of the more recent waves of the day is “Big Data,” by which is meant the collection and analysis of outlandishly large data sets, lately come to light as a side effect of new media. Search, location, communications, and social networks are all data gushers, and the rush is on. There is no doubt that Big Data analytics is powerful.

Disclosure: I’m currently paid to work on the periphery of a Big Data project, petabytes of live data compressed into cubes, pivoted, sliced, and doled out to a thread for visualizing geographically. My minor end of the Big Data shtick is the map. I am privy to neither data origins nor ends, but even without reading tea leaves, we can sense the forms and shadows of larger spheres snuffling in the night.

Analytics is used to learn from the past and, hopefully, see into the future; hence the rush to harness this new media power for business opportunism and good old-fashioned power politics. Big Data is an edge in financial markets, where microseconds gain or lose fortunes. It can reveal opinion, cultural trends, markets, and social movements ahead of competitors. It can quantify lendability, insurability, taxability, hireability, or securability. It’s an x-ray into social networks, where appropriate pressure can gain advantage or thwart antagonists. Insight is the more benign side of Big Data. The other side, influence, attracts the powerful like bees to sugar.

Analytics is just the algorithm or lens to see forms in the chaos. The data itself is generated by new media gate keepers, the Googles, Twitters, and Facebooks of our new era, who are now in high demand, courted and feted by old regimes grappling for advantage.

Border Politics
Despite trenchant warnings from the likes of Nassim Taleb (“Beware the Big Errors of ‘Big Data’”) and Evgeny Morozov (Net Delusion), the latest issue of MIT Technology Review declares in all caps:

“The mobile phone, the Net, and the spread of information —
a deadly combination for dictators”
MIT Tech Review


Dispelling the possibility of irony, feature articles follow in quick succession:

“A More Perfect Union”
“The definitive account of how the Obama campaign used big data to redefine politics.”
By Sasha Issenberg
“How Technology Has Restored the Soul of Politics”
“Longtime political operative Joe Trippi cheers the innovations of Obama 2012, saying they restored the primacy of the individual voter.”
By Joe Trippi
“Bono Sings the Praises of Technology”
“The musician and activist explains how technology provides the means to help us eradicate disease and extreme poverty.”
By Brian Bergstein

Whoa, anyone else feeling queasy? This has to be a classic case of Net Delusion! MIT Tech Review is notably the press ‘of technologists,’ ‘by technologists,’ and ‘for technologists,’ but the hubris is striking even for academic and engineering types. The masters of technology are not especially sensitive to their own failings; after all, Google, the prima donna of new media, is anything but demure in its ambitions:

“Google’s mission is to organize the world’s information and make it universally accessible and useful.”
… and in unacknowledged fine print, ‘for Google’

Where power is apparent the powerful prevail, and who is more powerful than the State? Intersections of technologies often prove fertile ground for change, but change is transient, almost by definition. Old regimes accommodate new regimes, harnessing new technologies to old ends. The Mongol pony, the machine gun, the aeroplane, and nuclear fission each bestowed only temporary technological advantage. It is not at all apparent what is inevitable about the demise of old regime power in the face of new information velocity.

What Big Data offers with one hand it takes away with the other. Little programs like “socially responsible curated treatment” or “cognitive infiltration” are only possible with Big Data analytics. Any powerful elite worthy of the name would love handy Ministry of Truth programs that steer opinion away from “dangerous” ideas.

“It is not because the truth is too difficult to see that we make mistakes… we make mistakes because the easiest and most comfortable course for us is to seek insight where it accords with our emotions – especially selfish ones.”

Alexander Solzhenitsyn

Utopian Borders
Techno-utopianism, embarrassingly ardent in the Jan/Feb MIT Tech Review, blinds us to dangerous potentials. There is no historical precedent for presuming an asymmetry of technology somehow inevitably biased toward higher moral ends. Big Data technology is morally agnostic and only reflects the moral compass of its wielder. The idea that “…the spread of information is a deadly combination for dictators” may just as likely be “a deadly combination” for the naïve optimism of techno-utopianism. Just ask an Iranian activist. When the bubble bursts, we will likely learn the hard way how the next psychopathic overlord will grasp the handles of new media technology, twisting big data in ways still unimaginable.

Big Data Big Brother?
Big Brother? US linked to new wave of censorship, surveillance on web
Forbes Big Data News Roundup
The Problem with Our Data Obsession
The Robot Will See You Now
Educating the Next Generation of Data Scientists
Moderated by Edd Dumbill (I’m not kidding)

Digital Dictatorship
Wily regimes like the DPRK can leverage primitive, retro-fashion brutality to insulate their populace from new media. Islamists master new media for more ancient forms of social pressure: sharia internet, fatwa by tweet. Oligarchies have co-opted the throttle of information, doling out artfully measured information and disinformation into the same stream. The elites of enlightened western societies adroitly harness new market methods for propagandizing their anaesthetized citizenry.

Have we missed anyone?

… and of moral borders
“The battle line between good and evil runs through the heart of every man”
The Gulag Archipelago, Alexander Solzhenitsyn


We have crossed the border. Everything is changed. Or is it?

Interestingly, Cormac McCarthy is also the author of the Pulitzer Prize-winning book The Road, arguably about the erasure of all borders, apparently taking up where techno enthusiasm left off.

Fig 2 – a poor man’s Big Data – GPU MapD – can you find your tweets?

(Silverlight BusinessApplication, Membership on Azure) => To Map || Not To Map

We owe Alonzo Church for this one:

    WebContext.Current.Authentication.LoggedIn += (se, ev) =>
       Link2.Visibility = Visibility.Visible;

    WebContext.Current.Authentication.LoggedOut += (se, ev) =>
        Link2.Visibility = Visibility.Collapsed;

The λ calculus “goes to” Lambda Expressions, now in C#, turning mere programmers into logicians. We share in the bounty of Alonzo’s intellect in some remote fashion. Likely others understand this functional proof apparatus, but for me it’s a mechanical thing. Put this here and that there, slyly kicking an answer into existence. Is a view link visible or is it not?

Here we are not concerned with “the what” of a map but with a small preliminary question: “whether to map at all?”

Azure provides a lot of cloud for the money. Between SQL Azure, blob storage, and hosted services, both web and worker roles, there are a lot of very useful resources. All, except SQL Azure, currently include built-in scaling and replication. SQL Azure scaling is still awaiting the release of federation at some point in the future.

Visual Studio 2010 adds a Cloud toolkit with template-ready services (not to be confused with “shovel ready”). One interesting template is the ASP .NET template combined with an Azure-hosted Web Role service. The usefulness here is the full ASP .NET authentication and profile capability. If you need simple authentication, profile, and registration built on the ASP .NET model, it is all here.

The goal then is to leverage the resourceful templates of ASP .NET authentication, modify this to a Silverlight version (which implies a Bing Map Control is in the future, along with some nifty transition graphics), and finally deploy to Azure.

And now for the recipe

Fig1 – open a new VS2010 project using the Cloud template

Fig2 – add an ASP .NET Web Role

Fig3 – click run to open in local development fabric with login and registration

This is helpful and works fine in a local dev fabric.

Now on to Azure:

First create a Storage Account and a Hosted Service.

Fig4 – Storage Account for use in publishing and diagnostics logs

Fig5 – Azure control manager

Fig6 – create a SQL Azure instance to hold the aspnetdb

Now using VS2010 publish the brand new ASP .NET web role to the new Azure account.

Problem 1
However, publish to Azure and you will immediately face a couple of problems. First, the published Azure service is a Web Role, not a VM, which means there is no local SQL Server available for attaching a default aspnetdb. SQL Azure runs as its own separate instance. You will need to manually create the aspnetdb database on a SQL Azure instance and then provide a connection string to it in the Web Role.

Microsoft provides SQL scripts adapted for SQL Azure. Running these scripts in your SQL Azure DB creates the following membership database and tables:

Fig7 – aspnetdb SQL Azure database

Since I like Silverlight better, it's probably time to dump plain-Jane ASP .NET in favor of the Silverlight Navigation version. Silverlight provides the animated fades and flips to which most of us are already unconsciously accustomed.

In Visual Studio add a new project from the Silverlight templates. The Silverlight Business Application is the one with membership.

Fig8 – Silverlight Business Application

We are now swapping out the old ASP .NET Cloud Web Role for a Silverlight BusinessApplication1.Web.

Step 1
Add references to the Microsoft.WindowsAzure.Diagnostics, ServiceRuntime, and StorageClient assemblies in BusinessApplication1.Web

Step 2
Copy WebRole1's WebRole.cs to BusinessApplication1.Web and correct the namespace

Step 3
Right click Roles under CloudService1 and add the Role in solution: BusinessApplication1.Web
(Optionally remove WebRole1 from Roles and delete the WebRole1 project, since it is no longer needed)

Step 4
Add a DiagnosticsConnectionString property to the Roles/BusinessApplication1.Web role that points to the Azure blob storage. Now we can add a DiagnosticMonitor and trace logging to the WebRole.cs

Step 5
Open the BusinessApplication1.Web web.config, add a connection string pointed at the SQL Azure aspnetdb, and set up the ApplicationServices providers:

<?xml version="1.0"?>
<configuration>
  <configSections>
    <sectionGroup name="system.serviceModel">
      <section name="domainServices"
               type="System.ServiceModel.DomainServices.Hosting.DomainServicesSection,
                     System.ServiceModel.DomainServices.Hosting, Version=4.0.0.0, Culture=neutral,
                     PublicKeyToken=31BF3856AD364E35"
               allowDefinition="MachineToApplication" requirePermission="false" />
    </sectionGroup>
  </configSections>

  <connectionStrings>
    <remove name="LocalSqlServer"/>
    <add name="ApplicationServices"
         connectionString="Server=tcp:<SqlAzureServer>;Database=aspnetdb;User ID=<user>;Password=<password>;Encrypt=True;"
         providerName="System.Data.SqlClient" />
  </connectionStrings>

  <system.web>
    <httpModules>
      <add name="DomainServiceModule"
           type="System.ServiceModel.DomainServices.Hosting.DomainServiceHttpModule,
                 System.ServiceModel.DomainServices.Hosting, Version=4.0.0.0, Culture=neutral,
                 PublicKeyToken=31BF3856AD364E35" />
    </httpModules>
    <compilation debug="true" targetFramework="4.0" />

    <authentication mode="Forms">
      <forms name=".BusinessApplication1_ASPXAUTH" />
    </authentication>

    <membership defaultProvider="AspNetSqlMembershipProvider">
      <providers>
        <clear/>
        <add name="AspNetSqlMembershipProvider"
             type="System.Web.Security.SqlMembershipProvider"
             connectionStringName="ApplicationServices"
             enablePasswordRetrieval="false" enablePasswordReset="true"
             requiresQuestionAndAnswer="false" requiresUniqueEmail="false"
             maxInvalidPasswordAttempts="5" minRequiredPasswordLength="6"
             minRequiredNonalphanumericCharacters="0" passwordAttemptWindow="10"
             applicationName="/" />
      </providers>
    </membership>

    <profile defaultProvider="AspNetSqlProfileProvider">
      <providers>
        <clear/>
        <add name="AspNetSqlProfileProvider"
             type="System.Web.Profile.SqlProfileProvider"
             connectionStringName="ApplicationServices" applicationName="/"/>
      </providers>
      <properties>
        <add name="FriendlyName"/>
      </properties>
    </profile>

    <roleManager enabled="false" defaultProvider="AspNetSqlRoleProvider">
      <providers>
        <clear/>
        <add name="AspNetSqlRoleProvider"
             type="System.Web.Security.SqlRoleProvider"
             connectionStringName="ApplicationServices" applicationName="/" />
        <add name="AspNetWindowsTokenRoleProvider"
             type="System.Web.Security.WindowsTokenRoleProvider" applicationName="/" />
      </providers>
    </roleManager>
  </system.web>

  <system.webServer>
    <validation validateIntegratedModeConfiguration="false"/>
    <modules runAllManagedModulesForAllRequests="true">
      <add name="DomainServiceModule" preCondition="managedHandler"
           type="System.ServiceModel.DomainServices.Hosting.DomainServiceHttpModule,
                 System.ServiceModel.DomainServices.Hosting, Version=4.0.0.0, Culture=neutral,
                 PublicKeyToken=31BF3856AD364E35" />
    </modules>
  </system.webServer>

  <system.serviceModel>
    <serviceHostingEnvironment aspNetCompatibilityEnabled="true"
                               multipleSiteBindingsEnabled="true" />
  </system.serviceModel>
</configuration>

Now we can publish our updated Silverlight Web Role to Azure.

Fig9 – publish to Azure

Problem 2

It turns out that a couple of assemblies required by our Silverlight membership BusinessApplication are missing from Azure's OS. When an assembly is missing, publish will cycle endlessly between Initializing and Stopped without ever giving a reason.

When this happens I know to start looking at the assembly dlls. Unfortunately they are numerous, and without a message indicating which is at fault we are reduced to trial and error. The turnaround on publish to Azure is not fast; debug cycles take five to ten minutes, so this elimination process is tedious. However, I spent an afternoon on this and can save you some time. The trick assemblies are the two WCF RIA Services assemblies: System.ServiceModel.DomainServices.Hosting and System.ServiceModel.DomainServices.Server.

To fix this problem:
Open References under BusinessApplication1.Web and click on each of the above two assemblies. In Properties, set "Copy Local" to True. A publish to Azure will then also copy these two missing assemblies from your local VS2010 environment up to Azure, and publish will run as expected.

The publish process then brings us to this:

Fig10 – Login

And this:

Fig11 – Register

That in turn allows the Alonzo λ magic:

            WebContext.Current.Authentication.LoggedIn += (se, ev) =>
                Link2.Visibility = Visibility.Visible;


Well, it works, though maybe not as easily as one would expect from a Cloud template.

New Stuff

Some convergence stuff has been passing by my window recently. Too bad there isn’t more time to play, but here are a few items of note to the spatial world.

Silverlight 5 early demo from the Firestarter webcast

First, a look ahead to Silverlight 5, due in 2011:

Silverlight Firestarter video

Some very interesting pre-release demos showcase mesh graphics running at GPU speeds. WPF has had 3D mesh capability for a few years, but SL5 will add the same camera-level control of 3D scene graphs to spice up delivery in the browser. A whole new level of interesting possibilities opens up with 3D in the browser, as is apparent in the Rosso demos above.

3D View of what?
So then, 3D viewing; but how do you get a 3D image into the model in the first place? Well, thanks to some links from Gene Roe and the interesting open source driver activity around the Kinect, here are some possibilities.

The obvious direction is a poor man's 3D modeler. It won't be too long before we see these devices in every corner of our rooms, supplementing the omnipresent security cameras with 3D views. (Nice to know that the security folks can wander around mirror land looking under tables and behind curtains.)

Well, it could be worse; see the DIY TSA scanner. But I doubt that Kinect will ever incorporate millimeter-wave or backscatter sensors into the commercial versions.


I know we’ve all seen Jack Dangermond’s “instrumented universe” prophecies, but another angle is remote dashboarding. Put the instrument controls right on the floor in front of us and abstract the sensored world one step further into the mirror. That way heavy equipment operators can get nice aerobic workouts too.

Next Step

Can't miss the augmented reality angle. A pico projector in a phone ought to handle this trick in the handheld world.

and then UGV

Why not let that Roomba do some modeling?

or UAV

Add a bit of UAV and we can move mirror land from the 30cm (1ft) resolution capture soon to be added to Bing Maps (see Blaise Aguera y Arcas' comments) to what looks like sub-centimeter. Do we doubt that the venerable licensed surveyor will eventually have one of these Gatewing X100 thing-a-ma-bobs in his truck?


So we move along. I see some sanity in WP7 spanning XNA and Silverlight on the same mobile device. Convergence into the 3D mirror land will be less painful for those of us developing in the Microsoft framework. HTML5 is nice, and I'm happy to see the rebirth of SVG, but the world moves apace, and I look forward to some genuine 3D adventures in the next-generation mirror land.

Route Optimization and Death of a Salesman

Taking an arbitrary list of addresses and finding an optimized route has been a common problem for as long as there have been traveling salesmen. It has been the bane of many a computer science student, leading to more than a few minor tragedies, as well as Arthur Miller's great American tragedy. Technically known as an NP-complete problem, its computational time escalates quickly and tragically as the number of stops increases.

Bing includes some valuable routing services: routing with waypoints, traffic optimization, optimization for time or distance, and even walking versus driving versus major routes. However, there are some limitations to the Bing Routing Service, and even Bing could not prevent Willy Loman's tragic demise.

One limitation of Bing Routing is that the maximum number of waypoint stops is 25. This is probably not a notable issue for individual users, but it is a problem for enterprise use with large numbers of stops to calculate.

Perhaps a broader issue, often encountered, is waypoint or stop order. The Bing Route service does not attempt to reorder waypoints for optimal routing. Route segments between waypoints are optimized, but the order of the waypoints themselves is up to the user. Often stop points come from a SharePoint list, a contact address book, an Excel spreadsheet, or a database, without benefit of sort ordering based on route optimization. This is where route optimization comes in handy.

OnTerra Systems has been involved with Fleet management for some time now and recently introduced a RouteOptimization Service.

There are of course fairly complex algorithms involved, and computational scientists have been at some pains to circumscribe the limits of the problem. One simplification that makes this much easier is to ignore routing momentarily and use a shortest-distance algorithm to make a first route-optimization pass. One way this can be accomplished is by looking at a set of connected nodes and determining segment intersects. By recursively re-ordering nodes to eliminate segment intersects, the computer quickly untangles the node set into an ordered route.

The nodes are first ordered under the simplified assumption of straight-line segment connectors, and only after this re-ordering is the usual Bing Route service triggered to get actual routes between nodes. This is one of those algorithms that takes some liberties to find a solution in real time, i.e. time we can live with. It doesn't guarantee "The Absolute Best" route, but a route that is "good enough"; in the vast majority of cases it will in fact be the best, just not guaranteed.
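The uncrossing pass can be sketched in a few lines. This is not OnTerra's actual implementation, just a minimal 2-opt style illustration of the idea, assuming straight-line segments between (x, y) stops: whenever two segments intersect, reversing the sub-tour between them removes the crossing and shortens the tour.

```python
def crossing(p1, p2, p3, p4):
    """True if straight segment p1-p2 properly crosses segment p3-p4."""
    def ccw(a, b, c):
        return (c[1] - a[1]) * (b[0] - a[0]) > (b[1] - a[1]) * (c[0] - a[0])
    return ccw(p1, p3, p4) != ccw(p2, p3, p4) and ccw(p1, p2, p3) != ccw(p1, p2, p4)

def uncross(route):
    """Repeatedly reverse sub-tours whose segments intersect (2-opt style)."""
    improved = True
    while improved:
        improved = False
        for i in range(len(route) - 3):
            for j in range(i + 2, len(route) - 1):
                if crossing(route[i], route[i + 1], route[j], route[j + 1]):
                    route[i + 1:j + 1] = reversed(route[i + 1:j + 1])
                    improved = True
    return route
```

For example, the crossed path (0,0) → (1,1) → (1,0) → (0,1) comes back untangled as (0,0) → (1,0) → (1,1) → (0,1).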

Of course mathematicians and purists cringe at the idea; however, that is ordinarily what engineering is all about: getting an approximate solution within the time constraint of a usable solution.

OnTerra Systems has added some other refinements to the service with the help of the Bing Maps Silverlight Control. This UI uses both the Bing Geocoding and Bing Route services. In addition it makes use of the simplified RouteOptimization service to build the route UI shown above. Users can choose whether to optimize for time or distance. Minor modifications could use traffic optimization, available in Bing, just as well; however, traffic optimization requires a different license and a bit more money.

Entering points can be accomplished several ways, with a click approach, from manually entered addresses, or even more easily from an Excel spreadsheet of addresses. Route definition can be set for round trip, one way to a known end destination, or one way to whichever end is the most efficient.

This UI is only one of many that could be used. As a WCF Service, RouteOptimization can take latitude, longitude point sets from any source and return the ordered points for display in any web UI.

This example Silverlight UI is a three-step process of service chaining. First, points are collected from the user. If addresses are listed rather than latitude/longitude points, the Bing Geocode Service is called to arrive at the necessary list of latitude/longitude points. These in turn are handed to the RouteOptimizer Service. The returned points are now ordered and can be given to the Bing Route Service in the third step of the chain. Finally the resulting route is plotted in the Silverlight map UI.

Waypoint limit work around:
An interesting side note to the RouteOptimization Service is the ability to extend the number of waypoints. Microsoft's restriction to 25 waypoints is probably a scaling throttle to prevent service users from sending in hundreds or even thousands of waypoints for a single route. The Route Service could be swamped by many large-count waypoint requests.

However, with the ability to order waypoints with an outside service, this limit has a work-around. First run an optimization on all of the stops using the RouteOptimizer service. Then simply loop through the returned set with multiple calls to Bing Route in 25-node chunks. This divide-and-conquer is a simple work-around to the waypoint limitation. Note that the public sample here is also limited to 25 stops; to use this work-around, you'll need a licensed version.
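The chunking itself is straightforward. A sketch of the idea (mine, not OnTerra's code): slice the optimized stop order into batches of at most 25, overlapping one stop so each batch's route begins where the previous one ended.

```python
def chunk_waypoints(stops, limit=25):
    """Split an ordered stop list into batches of at most `limit` stops.
    Each batch repeats the previous batch's last stop, so the per-batch
    Bing Route results join into one continuous route."""
    batches = []
    i = 0
    while i < len(stops) - 1:
        batches.append(stops[i:i + limit])
        i += limit - 1  # overlap one stop between consecutive batches
    return batches
```

For 60 optimized stops this yields three batches of 25, 25, and 12 stops, with stop 24 closing the first batch and opening the second, and likewise stop 48 between the second and third.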

Alternate UIs
Of course UIs may multiply. Route Savvy is primarily a service to which any number of UIs can connect and Silverlight happens to be a nice way to create useful UIs. Perhaps a more convenient UI is the Bing Map Gallery version found at Bing Maps Explore:
Bing Map Explore Route Savvy.

Linda Loman and the Parent Car Pool:
Anyone out there with kids in sports has probably run into this traveling salesman problem. (Fast forward to 2010 and Linda Loman jumps in the van with Biff.) There are 6 kids in the car pool. What is the optimal route for picking up all the kids and delivering them to the playing field on time? If you want to know, just fire up RouteOptimizer and give it a whirl.

OnTerra Systems Route Optimization Service supplements the already valuable Bing Services by both optimizing input stop orders and extending the waypoint limitation. This is an illustration of the utility of web service chains. The ability to link many services into a solution chain is one of the dreams come true of web development and one reason why web applications continue to supersede the desktop world.

Please note, even though RouteOptimizer will help Linda Loman’s car pool run smoothly, it won’t prevent the death of a salesman. Sorry, only a higher plane optimization can do that.

Connecting the Data Dots – Hybrid Architecture

Web mapping has generally been a 3 tier proposition. In a typical small to medium scale scenario there will almost always be a data source residing in a spatial database, with some type of service in the middle relaying queries from a browser client UI to the database and back.

I've worked with all kinds of permutations of 3 tier web mapping, some easier to use than others. However, a few months ago I sat down with the Microsoft offerings and worked out an example using SQL Server + WCF + Bing Maps Silverlight Control. Notice that tools for all three tiers come from a single vendor, in Microsoft Visual Studio. I have to admit that it is really nice to have integrated tools across the whole span. The Microsoft option has only been possible in the last year or so, with the introduction of SQL Server 2008 and the Bing Maps Silverlight Map control.

The resulting project is available on codeplex: dataconnector
and you can play with a version online here: DataConnectorUI

When working on the project I started out in WCF with some reservations. SQL Spatial was similar to much of my earlier work with PostGIS, while Bing Maps Silverlight Control and XAML echoed work I'd done a decade back with SVG, just more so. However, for the middle tier I had generally used something from the OGC world such as GeoServer, so putting together my own middle-tier service with WCF was largely experimental. WCF turned out to be less daunting than I had anticipated. In addition, it afforded an opportunity to try out a few different approaches for transferring spatial query results to the UI client.

There is more complete information on the project here:

After all the experimental approaches my conclusion is that even with the powerful CLR performance of Silverlight Bing Maps Control, most scenarios still call for a hybrid approach: raster performance tile pyramids for large extent or dense data resources, and vectors for better user interaction at lower levels in the pyramid.

Tile pyramids don't have to be static, and DataConnector has examples of both static and dynamic tile sources. The static example is a little different from other approaches I've used, such as GeoWebCache, since it drops tiles into SQL Server as they are created rather than using a file-system pyramid. I imagine that a straight static file-system source could be a bit faster, but it is nice to have indexed quadkey access and all data residing in a single repository. This was actually a better choice when deploying to Azure, since I didn't have to work out a blob storage option for the tiles in addition to the already available SQL Azure.
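Quadkey access is what makes the SQL Server store convenient. The Bing Maps tile system interleaves the bits of a tile's x/y address into a single base-4 string, one digit per zoom level, so a single indexed key column locates any tile in the pyramid. A minimal sketch of the standard conversion (Python for brevity):

```python
def tile_to_quadkey(tile_x, tile_y, level):
    """Build a Bing Maps quadkey by interleaving the bits of tile x and y,
    most significant bit first; one base-4 digit per zoom level."""
    digits = []
    for i in range(level, 0, -1):
        mask = 1 << (i - 1)
        digit = 0
        if tile_x & mask:
            digit += 1
        if tile_y & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)
```

Tile (3, 5) at level 3, for instance, becomes quadkey "213", which sorts and indexes nicely as a key column.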

Hybrid web mapping:

Here are some of the tradeoffs involved between vector and raster tiles.

Hybrid architectures switch between the two depending on the client’s position in the resource pyramid. Here is my analysis of feature count ranges and optimal architecture:

1. Low – For vector poly feature counts < 300 per viewport, I'd use vector queries from the DB. Taking advantage of the SQL Reduce function makes it possible to drop node counts for polygons and polylines at lower zoom levels. Points are more efficient, and up to 3000-5000 points per viewport are still possible.

2. Medium – For zoom levels with counts > 300 per viewport, I'd use a dynamic tile builder at the top of the pyramid. I'm not sure what the upper limit on performance is here; I've only run it on fairly small tables (2,000 records). Eventually dynamic tile building affects performance (at the server, not the client).

3. High – For zoom levels with high feature counts and large poly node counts, I'd start with pre-seeded static tiles at low zoom levels at the top of the pyramid, perhaps dynamic tiles in the middle of the pyramid, and vectors at the bottom.

4. Very High – For very high feature counts at low zoom levels near the top of the pyramid, I'd just turn off the layer. There probably isn't much reason to show very dense resources until the user moves into an area of interest. For dense point sources a heat map raster overview works best at the top of the pyramid. At middle levels I'd use a caching tile builder, with vectors again at the high zoom levels at the bottom of the pyramid.
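The four ranges boil down to a switch on viewport feature count. A rough sketch (the vector thresholds are the rules of thumb above; the dynamic/static cutoff is an arbitrary illustrative value, not a measured limit):

```python
def render_mode(viewport_feature_count, is_points=False):
    """Pick a rendering strategy from the viewport feature count."""
    vector_limit = 3000 if is_points else 300  # points are cheaper than polys
    if viewport_feature_count <= vector_limit:
        return "vector"          # query vectors straight from the DB
    if viewport_feature_count <= 100 * vector_limit:
        return "dynamic tiles"   # build raster tiles on the fly
    return "static tiles"        # pre-seeded pyramid, or turn the layer off
```

The client would call this per viewport change and swap layer sources accordingly.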

Here is a graphic view of some hybrid architectural options:

Data Distribution Geographically:
Another consideration is the distribution of data in the resource. Homogeneous geographic data density works best with hard zoom level switches. In other words, the switch from vector to raster tile can be coded to zoom level regardless of where the client has panned in the extent of the data. This is simple to implement.

However, where data is geographically heterogeneous, it might be nice to arrange switching according to density. An example might be parcel-data densities that vary across urban and rural areas. Instead of simple zoom levels, the switch between tile and vector is based on density calculations. Having a heat map overview available, for example, could provide a quick viewport density calculation based on a pixel sum of the heat map intersecting the user's viewport. This density calculation would drive the switch rather than a simpler zoom level. This way rural areas of interest gain the benefit of vectors higher in the pyramid than would be useful in urban areas.


Point Layers:
Points have a slightly different twist. On the one hand too many points clutter a map, while on the other their simplicity means more point vectors can be rendered before affecting UI performance. Heat maps are a great way to show density at higher levels in the pyramid, served from either a dynamic tile source or a more generalized caching tile pyramid. In a point layer scenario, at some level there is a switch from heat map to cluster icons, and then to individual pushpins. Using power scaling at pushpin levels allows higher density pins to show higher in the pyramid with less clutter. Power scaling ties the icon size of a pin to zoom level. Experiments showed an icon max limit for the Bing Silverlight Map Control of 3000-5000 per viewport.
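Power scaling itself is one line of arithmetic: tie the icon size exponentially to zoom level so pins shrink as the user zooms out. A sketch, with base size and growth factor as illustrative values of my own rather than anything from the Bing control:

```python
def pin_size(zoomlevel, base=32.0, factor=1.3, max_zoom=19):
    """Exponentially shrink pin icons as the user zooms out, so dense
    pin layers can appear higher in the pyramid with less clutter."""
    return base * factor ** (zoomlevel - max_zoom)
```

At max zoom a pin renders at full size; nine levels up it has shrunk by a factor of 1.3^9, roughly a tenth of the area per pin.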


Some Caveats:
Tile pyramids are of course most efficient when the data is relatively static. With highly dynamic data, tiles can be built on the fly, but with a consequent loss of performance as well as a load on the server that affects scaling. In an intermediate situation, with data that changes slowly, static tiles are still an option using a pre-seeding batch process run at some scheduled interval.

Batch tile loading also has limitations for very dense resources that require tiling deep in a pyramid, where the number of tiles grows very large. Seeding all levels of a deep pyramid takes some time, perhaps too much time. However, in a hybrid case this should rarely happen, since the bottom levels of the pyramid are handled dynamically as vectors.
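The reason deep pre-seeding gets expensive is geometric growth: every level down quadruples the tile count, so a full pyramid through level n holds (4^(n+1) - 1) / 3 tiles.

```python
def tiles_in_pyramid(max_level):
    """Total tiles in a full quadtree pyramid, levels 0 through max_level."""
    return sum(4 ** level for level in range(max_level + 1))
```

Through level 2 that is only 21 tiles, but through level 10 it is already 1,398,101, which is why the bottom of the pyramid is better left to dynamic vectors.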

It is also worth noting that with Silverlight, client hardware affects performance. Silverlight has an advantage for web distribution: it pushes the CPU load out to the clients, harnessing all their resources. However, an ancient client with old hardware will not match the performance of newer machines.


A hybrid zoom-level-based approach is the best general architecture for Silverlight web maps where large data sets are involved. Using your own service provides more options in architecting a web map solution.

Microsoft’s WCF is a nice framework especially since it leverages the same C# language, IDE, and debugging capabilities across all tiers of a solution. Microsoft is the only vendor out there with the breadth to give developers an integrated solution across the whole spectrum from UI graphics, to customizable services, to SQL spatial tables.

Then throw in global map and imagery resources from Bing, with Geocode, Routing, and Search services, along with the Azure cloud platform for scalability and it’s no wonder Microsoft is a hit with the enterprise. Microsoft’s full range single vendor package is obviously a powerful incentive in the enterprise world. Although relatively new to the game, Microsoft’s full court offense puts some pressure on traditional GIS vendors, especially in the web distribution side of the equation.

Media Playing down the LiDAR path

Fig 1 – Showing a sequence of LiDAR Profiles as a video clip

Video codecs harnessed to Silverlight MediaPlayer make a useful compression technique for large image sets. The video, in essence, is a pointer into a large library of image frames. Data collected in a framewise spatial sense can leverage this technique. One example where this can be useful is in Mobile Asset Collection or MAC UIs. MAC corridors can create large collections of image assets in short order.

Another potential use of this approach is conventional LiDAR point cloud profiles. In this project I wanted to step through a route, collecting profiles for later viewing. The video approach seemed feasible after watching cached profiles spin through a view panel connected to a MouseWheel event in another project. With a little bit of effort I was able to take a set of stepwise profiles, turn them into a video, and then connect the resulting video to a Silverlight Map Control client hooked up to a Silverlight MediaPlayer. This involved three steps:

1. Create a set of frames

Here is a code snippet used to follow a simple track down Colfax Ave in Denver, west to east. I used this to repeatedly grab LidarServer WMS GetProfileData requests and save the resulting .png images to a subdirectory. The parameters were set to sweep at a 250ft offset on either side of the track with a 10ft depth and a 1ft step interval. The result after a couple of hours was 19,164 .png profiles at 300px x 500px.

The code basically starts down the supplied path and calculates the sweep profile endpoints at each step using the Perpendicular function. This sweep line supplies the extent parameters for a WMS GetProfileData request.

private void CreateImages(object sender, RoutedEventArgs e)
{
    string colorization = "Elevation";
    string filter = "All";
    string graticule = "True";
    string drapeline = "False";
    double profileWidth = 300;
    double profileHeight = 500;

    double dx = 0;
    double dy = 0;
    double len = 0;
    double t = 0;
    string profileUrl = null;
    WebClient client = new WebClient();

    step = 57.2957795 * (step * 0.3048 / 6378137.0); //approx dec degree
    Point startpt = new Point();
    Point endpt = new Point();
    string[] lines = points.Text.Split('\r');
    startpt.X = double.Parse(lines[0].Split(',')[0]);
    startpt.Y = double.Parse(lines[0].Split(',')[1]);

    endpt.X = double.Parse(lines[1].Split(',')[0]);
    endpt.Y = double.Parse(lines[1].Split(',')[1]);

    dx = endpt.X - startpt.X;
    dy = endpt.Y - startpt.Y;
    len = Math.Sqrt(dx * dx + dy * dy);

    Line direction = new Line();
    direction.X1 = startpt.X;
    direction.Y1 = startpt.Y;
    width *= 0.3048;
    int cnt = 0;
    t = step / len;

    while (t <= 1)
    {
        direction.X2 = startpt.X + dx * t;
        direction.Y2 = startpt.Y + dy * t;

        // sweep line endpoints perpendicular to the track at this step
        Point p0 = Perpendicular(direction, width / 2);
        Point p1 = Perpendicular(direction, -width / 2);

        p0 = Mercator(p0.X, p0.Y);
        p1 = Mercator(p1.X, p1.Y);

        profileUrl = "" + // WMS GetProfileData base url elided
                     "&LEFT_XY=" + p0.X + "%2C" + p0.Y + "&RIGHT_XY=" + p1.X + "%2C" + p1.Y +
                     "&DEPTH=" + depth + "&SHOW_DRAPELINE=" + drapeline + "&SHOW_GRATICULE=" + graticule +
                     "&COLORIZATION=" + colorization + "&FILTER=" + filter +
                     "&WIDTH=" + profileWidth + "&HEIGHT=" + profileHeight;

        byte[] bytes = client.DownloadData(new Uri(profileUrl));
        using (FileStream fs = File.Create(String.Format(workingDir + "img{0:00000}.png", cnt++)))
        using (BinaryWriter bw = new BinaryWriter(fs))
        {
            bw.Write(bytes);
        }

        direction.X1 = direction.X2;
        direction.Y1 = direction.Y2;
        t += step / len;
    }
}

private Point Perpendicular(Line ctrline, double dist)
{
    Point pt = new Point();
    Point p1 = Mercator(ctrline.X1, ctrline.Y1);
    Point p2 = Mercator(ctrline.X2, ctrline.Y2);

    double dx = p2.X - p1.X;
    double dy = p2.Y - p1.Y;
    double len = Math.Sqrt(dx * dx + dy * dy);
    double e = dist * (dx / len);
    double f = dist * (dy / len);

    pt.X = p1.X - f;
    pt.Y = p1.Y + e;
    pt = InverseMercator(pt.X, pt.Y);
    return pt;
}

2. Merge the .png frames into an AVI

Here is a helpful C# AviFile library wrapper. Even though it is a little old, the functions I wanted in this wrapper worked just fine. The following WPF project simply takes a set of png files and adds them one at a time to an avi clip. Since I chose the (Full) uncompressed option, I had to break my files into smaller sets to keep from running into the 4GB limit on my 32-bit system. In the end I had 7 avi clips to cover the 19,164 png frames.

Fig 2 – Create an AVI clip from png frames

using System.Drawing;
using System.Windows;
using AviFile;

namespace CreateAVI
{
    public partial class Window1 : Window
    {
        public Window1()
        {
            InitializeComponent();
        }

        private void btnWrite_Click(object sender, RoutedEventArgs e)
        {
            int startframe = int.Parse(start.Text);
            int frameInterval = int.Parse(interval.Text);
            double frameRate = double.Parse(fps.Text);
            int endframe = 0;

            string currentDirName = inputDir.Text;
            string[] files = System.IO.Directory.GetFiles(currentDirName, "*.png");
            if (files.Length > (startframe + frameInterval)) endframe = startframe + frameInterval;
            else endframe = files.Length;

            // seed the avi stream with the first frame
            Bitmap bmp = (Bitmap)System.Drawing.Image.FromFile(files[startframe]);
            AviManager aviManager = new AviManager(@currentDirName + outputFile.Text, false);
            VideoStream aviStream = aviManager.AddVideoStream(true, frameRate, bmp);

            Bitmap bitmap;
            for (int n = startframe + 1; n < endframe; n++)
            {
                if (files[n].Trim().Length > 0)
                {
                    bitmap = (Bitmap)Bitmap.FromFile(files[n]);
                    aviStream.AddFrame(bitmap);
                    bitmap.Dispose();
                }
            }
            aviManager.Close();
        }
    }
}
Next I used Microsoft Expression Encoder 3 to encode the set of avi files into a Silverlight-optimized VC-1 Broadband variable-bitrate wmv output, which expects a broadband connection for an average 1632 Kbps download. The whole path sweep takes about 12.5 minutes to view and 53.5MB of disk space. I used a 25fps frame rate when building the avi files. Since the sweep step is 1ft, this works out to about a 17mph pace down my route.
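The back-of-envelope numbers check out; with one profile frame per foot played at 25fps, the apparent speed and running time fall out directly:

```python
fps = 25        # video frame rate
step_ft = 1.0   # one profile frame per foot along the track
frames = 19164  # total profiles captured

speed_mph = fps * step_ft * 3600 / 5280  # 25 ft/s expressed in mph, about 17
minutes = frames / fps / 60              # total viewing time in minutes
```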

3. Add the wmv in a MediaPlayer and connect to Silverlight Map Control.


I used a similar approach for connecting a route path to a video described in “Azure Video and the Silverlight Path”. Expression Encoder 3 comes with a set of Silverlight MediaPlayers templates. I used the simple “SL3Standard” template in this case, but you can get fancier if you want.

Looking in the Expression templates subdirectory "C:\Program Files\Microsoft Expression\Encoder 3\Templates\en", select the ExpressionMediaPlayer.MediaPlayer template you would like to use. All of the templates start with a generic.xaml MediaPlayer template. Add .\Source\MediaPlayer\Themes\generic.xaml to your project, then look through this xaml for <Style TargetType="local:MediaPlayer">. Once a key name is added, this plain generic style can be referenced by the MediaPlayer in MainPage.xaml:
<Style x:Key="MediaPlayerStyle" TargetType="local:MediaPlayer">

  Width="432" Height="720"
  Style="{StaticResource MediaPlayerStyle}"

It is a bit more involved to add one of the fancier templates. It requires creating another ResourceDictionary xaml file, adding the styling from the template's Page.xaml, and then adding both the generic and the new template as merged dictionaries:

  <ResourceDictionary Source="generic.xaml"/>
  <ResourceDictionary Source="BlackGlass.xaml"/>

Removing unnecessary controls, like volume controls, mute button, and misc controls, involves finding the control in the ResourceDictionary xaml and changing Visibility to Collapsed.

Loading Route to Map

The list of node points at each GetProfileData frame was captured in a text file in the first step. This file is added as an embedded resource that can be loaded at initialization. Since there are 19164 nodes, the MapPolyline is thinned by keeping only every 25th node, resulting in a more manageable 766 node MapPolyline. The full node list is still kept in a routeLocations Collection. Having the full node list available helps to sync with the video timer. Since the video is encoded at 25fps, I can relate any video time to a node index.

private List<Location> routeLocations = new List<Location>();

private void LoadRoute()
{
    MapPolyline rte = new MapPolyline();
    rte.Name = "route";
    rte.Stroke = new SolidColorBrush(Colors.Blue);
    rte.StrokeThickness = 10;
    rte.Opacity = 0.5;
    rte.Locations = new LocationCollection();

    Stream strm = Assembly.GetExecutingAssembly().GetManifestResourceStream("OnTerra_MACCorridor.corridor.direction.txt");

    string line;
    int cnt = 0;
    using (StreamReader reader = new StreamReader(strm))
    {
        while ((line = reader.ReadLine()) != null)
        {
            string[] values = line.Split(',');
            Location loc = new Location(double.Parse(values[0]), double.Parse(values[1]));
            routeLocations.Add(loc);                        // keep the full node list for video sync
            if ((cnt++) % 25 == 0) rte.Locations.Add(loc);  // add a node every second of video
        }
    }
}

A Sweep MapPolyline is also added to the map with an event handler for MouseLeftButtonDown. The corresponding MouseMove and MouseLeftButtonUp events are added to the Map Control, which sets up a user drag capability. Every MouseMove event calls a FindNearestPoint(LL, routeLocations) function which returns a Location and updates the current routeIndex. This way the user sweep drag movements are locked to the route and the index is available to get the node point at the closest frame. This routeIndex is used to update the sweep profile end points to the new locations.
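The FindNearestPoint implementation isn't shown in the post, but the idea is just a closest-node scan over the full routeLocations list. A rough sketch (in Python with hypothetical names; squared coordinate differences are enough for ranking nearby nodes):

```python
# Sketch of the FindNearestPoint idea: lock a dragged map point to the route
# by scanning the full node list for the closest node. Names are hypothetical;
# the post's actual C# implementation is not shown.
def find_nearest_index(ll, route_locations):
    """Return the index of the route node closest to the (lat, lon) point ll."""
    best_i, best_d2 = 0, float("inf")
    for i, (lat, lon) in enumerate(route_locations):
        # squared degree distance is fine for ranking nearby nodes
        d2 = (lat - ll[0]) ** 2 + (lon - ll[1]) ** 2
        if d2 < best_d2:
            best_i, best_d2 = i, d2
    return best_i

# toy route running north along a meridian
route = [(39.0 + i * 0.001, -105.0) for i in range(100)]
print(find_nearest_index((39.0502, -105.0), route))   # 50
```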

Synchronizing MediaPlayer and Sweep Location

From the video perspective a DispatcherTimer polls the video MediaPlayer time position every 500ms. The MediaPlayer time position returned in seconds is multiplied by the frame rate of 25fps giving the routeLocations node index, which is used to update the sweep MapPolyline locations.

In reverse, a user drags and drops the sweep line at some point along the route. The MouseMove keeps the current routeIndex updated so that the mouse up event can change the sweep locations to its new location on the route. Also in this MouseLeftButtonUp event handler, the video position is updated by dividing the routeIndex by the frame rate.
VideoFile.Position = routeIndex/frameRate;
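Both sync directions reduce to the same frame-rate conversion, sketched here (Python for brevity; helper names are hypothetical, and note that float division avoids truncating the position to whole seconds):

```python
# Both sync directions are one frame-rate conversion (hypothetical helper
# names; in the post the real code lives in the DispatcherTimer tick and
# MouseLeftButtonUp handlers).
frame_rate = 25

def time_to_index(position_seconds):
    # video position polled from the MediaPlayer -> routeLocations node index
    return int(position_seconds * frame_rate)

def index_to_time(route_index):
    # dropped sweep location -> video position; float division avoids
    # truncating to whole seconds
    return route_index / frame_rate

print(time_to_index(12.4))   # 310
print(index_to_time(310))    # 12.4
```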


Since Silverlight handles media as well as maps it’s possible to take advantage of video codecs as a sort of compression technique. In this example, all of the large number of frame images collected from a LiDAR point cloud are readily available in a web mapping interface. Connecting the video timer with a node collection makes it relatively easy to keep map position and video synchronized. The video is a pointer into the large library of LiDAR profile image frames.

From a mapping perspective, this can be thought of as a raster organizational pattern, similar in a sense to tile pyramids. In this case, however, a single time axis is the pointer into a large image set, while with tile pyramids three axis pointers (zoom, x, and y) access the image library. In either case visually interacting with a large image library enhances the human interface. My previous unsuccessful experiment with video pyramids attempted to combine a serial time axis with the three tile pyramid axes. I still believe this will be a reality sometime.

Of course there are other universes than our earth’s surface. It seems reasonable to think dynamic visualizer approaches could be extended to other large domains. Perhaps Bioinformatics could make use of tile pyramids and video codecs to explore Genomic or Proteomic topologies. It would be an interesting investigation.

Bioinformatics is a whole other world. While we are playing around in “mirror land” these guys are doing the “mirror us.”

Fig 1 – MediaPlayer using BlackGlass template

Codename “Dallas” Data Subscription

Fig 1 – Data.Gov subscription source from Microsoft Dallas

What is Dallas? More info here: Dallas Blog

“Dallas is Microsoft’s Information Service, built on and part of the Windows Azure platform to provide developers and information workers with friction-free access to premium content through clean, consistent APIs as well as single-click BI/Reporting capabilities; an information marketplace allowing content providers to reach developers of ALL sizes (and developers of all sizes to gain access to content previously out of reach due to pricing, licensing, format, etc.)”

I guess I fall into the information worker category, and although “friction-free” may not be quite the same as FOSS, maybe it’s near enough to do some experiments. In order to make use of this data service you need a Windows Live ID with Microsoft. The signup page also asks for an invitation code, which you can obtain via email. Once the sign-in completes you will be presented with a Key, which is used for access to any of the data subscription services at this secure endpoint:

Here is a screen shot showing some of the free trial subscriptions that are part of my subscription catalog. This is all pretty new and most of the data sets listed in the catalog still indicate “Coming Soon.” The subscriptions interesting to me are the ones with a geographic component. None yet include an actual latitude, longitude, but in the case of the Data.Gov crime data there is at least a city and state attribution.

Fig 2 – Dallas subscriptions

Here is the preview page showing the default table view. You select the desired filter attributes and then click preview to show a table-based view. A copy of the url used to access the data is also shown on the left. Other view options include “atom 1.0”, “raw”, and direct import to Excel Pivot.

Fig 3 – Dallas DATA.Gov subscription preview – Crime 2006,2007 USA

There are two approaches for consuming data.

1. The easiest is the url parameter service approach:

$format=atom10

This isn’t the full picture because you also need to include your account key and a unique user ID in the http header. These are not sent in the url but in the header, which means using a specialized tool or coding an Http request.

	WebRequest request = WebRequest.Create(url);
	request.Headers.Add("$accountKey", accountKey);
	request.Headers.Add("$uniqueUserID", uniqueUserId);

	// Get the response
	HttpWebResponse response = (HttpWebResponse)request.GetResponse();
	using (StreamReader reader = new StreamReader(response.GetResponseStream()))
	{
	    string atom = reader.ReadToEnd(); // the Atom 1.0 payload
	}

The response in this case is in Atom 1.0 format as indicated in the format request parameter of the url.

<feed xmlns="">
  <title type="text">Data.Gov - U.S. Offenses Known to Law Enforcement</title>
  <rights type="text">2009 U.S. Government</rights>
  <link rel="self" title="Data.Gov - U.S. Offenses Known to Law Enforcement"
    href="$format=atom10" />
  <entry>
    <title type="text">Colorado / Alamosa in 2007</title>
    <link rel="self" href="$format=atom10&$page=1&$itemsperpage=1" />
    <content type="application/xml">
      <m:properties>
        <d:State m:type="Edm.String">Colorado</d:State>
        <d:City m:type="Edm.String">Alamosa</d:City>
        <d:Year m:type="Edm.Int32">2007</d:Year>
        <d:Population m:type="Edm.Int32">8714</d:Population>
        <d:Violentcrime m:type="Edm.Int32">57</d:Violentcrime>
        <d:MurderAndNonEgligentManslaughter m:type="Edm.Int32">1</d:MurderAndNonEgligentManslaughter>
        <d:ForcibleRape m:type="Edm.Int32">11</d:ForcibleRape>
        <d:Robbery m:type="Edm.Int32">16</d:Robbery>
        <d:AggravatedAssault m:type="Edm.Int32">29</d:AggravatedAssault>
        <d:PropertyCrime m:type="Edm.Int32">565</d:PropertyCrime>
        <d:Burglary m:type="Edm.Int32">79</d:Burglary>
        <d:LarcenyTheft m:type="Edm.Int32">475</d:LarcenyTheft>
        <d:MotorVehicleTheft m:type="Edm.Int32">11</d:MotorVehicleTheft>
        <d:Arson m:type="Edm.Int32">3</d:Arson>
      </m:properties>
    </content>
  </entry>
</feed>

If you’re curious about MurderAndNonEgligentManslaughter, I assume it is meant to be “Murder And Non-Negligent Manslaughter.” There are some other anomalies I happened across, such as very few violent crimes in Illinois. Perhaps Chicago politicians are better at keeping the slate clean.

2. The second approach using a generated proxy service is more powerful.

On the left corner of the preview page there is a Download C# service class link. This is a generated convenience class that lets you invoke the service with your account key, user ID, and url, and handles the LINQ to XML transfer of the Atom response into a nice class with properties. There is an Invoke method that does all the work of getting a collection of items generated from the atom entry records:

    public partial class DataGovCrimeByCitiesItem
    {
        public System.String State { get; set; }
        public System.String City { get; set; }
        public System.Int32 Year { get; set; }
        public System.Int32 Population { get; set; }
        public System.Int32 Violentcrime { get; set; }
        public System.Int32 MurderAndNonEgligentManslaughter { get; set; }
        public System.Int32 ForcibleRape { get; set; }
        public System.Int32 Robbery { get; set; }
        public System.Int32 AggravatedAssault { get; set; }
        public System.Int32 PropertyCrime { get; set; }
        public System.Int32 Burglary { get; set; }
        public System.Int32 LarcenyTheft { get; set; }
        public System.Int32 MotorVehicleTheft { get; set; }
        public System.Int32 Arson { get; set; }
    }

    public List<DataGovCrimeByCitiesItem> Invoke(System.String state,
                System.String city,
                System.String year,
                int page)

Interestingly, you can’t just drop this proxy service code into the Silverlight side of a project. It has to be on the Web side. In order to be useful for a Bing Maps Silverlight Control application you still need to add a Silverlight WCF service to reference on the Silverlight side. This service simply calls the nicely generated Dallas proxy service, whose results then show up in the async completed callback.

private void GetCrimeData(string state, string city, string year, int page, string crime)
{
    DallasServiceClient dallasclient = GetServiceClient();
    dallasclient.GetItemsCompleted += svc_DallasGetItemsCompleted;
    dallasclient.GetItemsAsync(state, city, year, page, crime);
}

private void svc_DallasGetItemsCompleted(object sender, GetItemsCompletedEventArgs e)
{
    if (e.Error == null)
    {
        ObservableCollection<DataGovCrimeByCitiesItem> results =
            e.Result as ObservableCollection<DataGovCrimeByCitiesItem>;
    }
}

This is all very nice, but I really want to use it with a map. Getting the Dallas data is only part of the problem. I still need to turn the City, State locations into latitude, longitude locations. This can easily be done by adding a reference to the Bing Maps Web Services Geocode service. With the geocode service I can loop through the returned items collection and send each off to the geocode service getting back a useable LL Location.

foreach (DataGovCrimeByCitiesItem item in results)
{
    GetGeocodeLocation(item.City + "," + item.State, item);
}

Since all of these geocode requests are also async callbacks, I need to pass my DataGovCrimeByCitiesItem object along as the GeocodeCompletedEventArgs e.UserState. It is also a bit tricky determining exactly when all the geocode requests have completed. I use a count down to check for a finish.
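The count down pattern is simple enough to sketch (a Python stand-in for the C# handlers; in the real app each GeocodeCompleted callback would decrement the shared counter, so a lock guards it against interleaving):

```python
# Sketch of the count-down used to detect when a batch of async geocode
# requests has all completed. Python stand-in; in C# each GeocodeCompleted
# handler would decrement the shared counter.
import threading

class CountDown:
    def __init__(self, count, on_done):
        self.count = count
        self.on_done = on_done
        self._lock = threading.Lock()

    def completed_one(self):
        # called once per completed geocode callback
        with self._lock:
            self.count -= 1
            if self.count == 0:
                self.on_done()   # every request is in: safe to draw the map

finished = []
cd = CountDown(3, lambda: finished.append(True))
for _ in range(3):
    cd.completed_one()
print(finished)   # [True]
```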

With a latitude, longitude in hand for each of the returned DataGovCrimeByCitiesItem objects I can start populating the map. I chose the bubble graph approach, with the crime statistic turned into a diameter. This requires normalizing by the maximum value. It looks nice, although I’m not too sure how valuable such a graph actually is. Unfortunately this CTP version of the Dallas data service has an items-per-page limit of 100. I can see why this is done, to prevent massive data queries, but it complicates normalization since I don’t have all the pages available at one time to calculate a maximum. I could work out a way to call several pages, but there is an odd behaviour in which pages greater than 1 seem to loop back to the beginning of the results to fill out the default 100 count. There ought to be some kind of additional query for the count, max, and min of result sets; I didn’t see one in my experiments.
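The bubble normalization itself is just a scale-by-maximum, sketched here (Python; max_radius is an arbitrary illustration value, and note the caveat above that the maximum available is only the per-page maximum):

```python
# Bubble scaling sketch: normalize each statistic by the page maximum so the
# largest value gets the full bubble radius. max_radius is an arbitrary
# illustration value; the maximum here is only the per-page maximum, which is
# exactly the 100-item-page caveat noted in the text.
def bubble_radii(values, max_radius=40.0):
    m = max(values)
    return [round(max_radius * v / m, 1) for v in values]

# e.g. Violentcrime / PropertyCrime / Burglary counts for one city
print(bubble_radii([57, 565, 79]))   # [4.0, 40.0, 5.6]
```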

One drawback to my approach is the number of geocode requests that accumulate. I should really get my geocode list only once per state and save it locally. All the bubble crime calculations could then be done on a local in-memory cache, and there wouldn’t be a need for a fresh geocode loop with each change in the type of crime. However, this version is a proof of concept and lets me see some of the usefulness of these types of data services, as well as a few drawbacks of my initial approach.

Here is a view of the Access Report for my experiment. If you play with the demo you will be adding to the access tallies. Since this is a CTP I don’t get charged, but it is interesting to see how a dev pay program might utilize this report page. Unfortunately, User ID is currently not part of the Access Report. If the Access Report could also be sorted by User ID, you could simply identify each user with their own unique ID and track each user’s share of the burden.

Fig 4 – Dallas Data.Gov subscription Access Report


The interesting part of this exercise is seeing how the Bing Maps Silverlight Control can be the nexus of a variety of data service sources. In this simple demo I’m using the Bing Maps service, the Bing Maps Web Geocode Service, and the Dallas data service. I could just as easily add other sources from traditional WMS or WFS services, or local tile pyramids and spatial data tables. The data sources are in essence outsourced to other services. All the computation happens in the client, and a vastly more efficient distributed web app is the result. My server isn’t loaded with all kinds of data management issues, or even all that many http hits.

Fig 5 – Distributed Data Sources – SOA

Hauling Out the Big RAM

Amazon released a handful of new stuff.

“Make that a Quadruple Extra Large with room for a Planet OSM”

Fig 1 – Big Foot Memory

1. New Price for EC2 instances

              US                              EU
            Linux    Windows  SQL           Linux    Windows  SQL
m1.small    $0.085   $0.12    -             $0.095   $0.13    -
m1.large    $0.34    $0.48    $1.08         $0.38    $0.52    $1.12
m1.xlarge   $0.68    $0.96    $1.56         $0.76    $1.04    $1.64
c1.medium   $0.17    $0.29    -             $0.19    $0.31    -
c1.xlarge   $0.68    $1.16    $2.36         $0.76    $1.24    $2.44

Notice the small instance, now $0.12/hr, matches Azure Pricing

Compute = $0.12 / hour

This is not really apples to apples, since an Amazon instance is a virtual machine while Azure charges per deployed application. A virtual instance can have multiple service/web apps deployed.

2. Amazon announces a Relational Database Service RDS
Based on MySQL 5.1, this doesn’t appear to add a whole lot, since you could always start an instance with any database you wanted. MySQL isn’t exactly known for geospatial, even though it has some spatial capabilities. You can see a small comparison of PostGIS vs MySQL by Paul Ramsey. I don’t know if this comparison is still valid, but I haven’t seen much use of MySQL for spatial backends.

This is similar to Azure SQL Server, which is also a convenience deployment that lets you run SQL Server as an Azure service without all the headaches of administration and maintenance tasks. Neither of these options is cloud scaled, meaning they are still single instance versions, not cross-partition capable. The SQL Azure Server CTP has an upper limit of 10Gb, as in hard drive, not RAM.

3. Amazon adds New high memory instances

  • High-Memory Double Extra Large Instance 34.2 GB of memory, 13 EC2 Compute Units (4 virtual cores with 3.25 EC2 Compute Units each), 850 GB of instance storage, 64-bit platform $1.20-$1.44/hr
  • High-Memory Quadruple Extra Large Instance 68.4 GB of memory, 26 EC2 Compute Units (8 virtual cores with 3.25 EC2 Compute Units each), 1690 GB of instance storage, 64-bit platform $2.40-$2.88/hr

These are new virtual instance AMIs that scale up as opposed to scaling out. Scaled out options use clusters of instances in the Grid Computing/Hadoop type of architectures. There is nothing to prohibit using clusters of scaled up instances in a hybridized architecture, other than cost. However, the premise of Hadoop arrays is “divide and conquer,” so it makes less sense to have massive nodes in the array. Since scaling out involves moving the problem to a whole new parallel programming paradigm, with all of its consequent complexity, it also means owning the code. In contrast, scaling up is generally very simple: you don’t have to own the code or even recompile, just install on more capable hardware.

Returning to the Amazon RDS, Amazon has presumably taken an optimized compiled route and offers prepackaged MySQL 5.1 instances ready to use:

  • db.m1.small (1.7 GB of RAM, $0.11 per hour)
  • db.m1.large (7.5 GB of RAM, $0.44 per hour)
  • db.m1.xlarge (15 GB of RAM, $0.88 per hour)
  • db.m2.2xlarge (34 GB of RAM, $1.55 per hour)
  • db.m2.4xlarge (68 GB of RAM, $3.10 per hour)

Of course the higher spatial functionality of PostgreSQL/PostGIS can be installed on any of these high memory instances as well; it is just not prepackaged by Amazon. The important thing to note is that memory approaches 100Gb per instance! What does one do with all that memory?

Here is one use:

“Google query results are now served in under an astonishingly fast 200ms, down from 1000ms in the olden days. The vast majority of this great performance improvement is due to holding indexes completely in memory. Thousands of machines process each query in order to make search results appear nearly instantaneously.”
Google Fellow Jeff Dean keynote speech at WSDM 2009.

Having very large memory footprints makes sense for increasing performance on a DB application. Even fairly large data tables can reside entirely in memory for optimum performance. Whether a database makes use of the best optimized compiler for Amazon’s 64bit instances would need to be explored. Open source options like PostgreSQL/PostGIS would let you play with compiling in your choice of compilers, but perhaps not successfully.

Todd Hoff has some insightful analysis in his post, “Are Cloud-Based Memory Architectures the Next Big Thing?”

Here is Todd Hoff’s point about having your DB run inside of RAM – remember that 68Gb Quadruple Extra Large memory:

“Why are Memory Based Architectures so attractive? Compared to disk, RAM is a high bandwidth and low latency storage medium. Depending on who you ask the bandwidth of RAM is 5 GB/s. The bandwidth of disk is about 100 MB/s. RAM bandwidth is many hundreds of times faster. RAM wins. Modern hard drives have latencies under 13 milliseconds. When many applications are queued for disk reads latencies can easily be in the many second range. Memory latency is in the 5 nanosecond range. Memory latency is 2,000 times faster. RAM wins again.”

Wow! Can that be right? “Memory latency is 2,000 times faster.”

(Hmm… 13 milliseconds = 13,000,000 nanoseconds,
so 13,000,000 ns / 5 ns = 2,600,000x? And 5 GB/s / 100 MB/s = 50x? Am I doing the math right?)
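Redoing that arithmetic with the quoted figures suggests the latency gap is even wider than the quote states:

```python
# Redoing the latency and bandwidth ratios from the quoted figures.
disk_latency_ns = 13 * 1_000_000     # 13 ms expressed in nanoseconds
ram_latency_ns = 5
print(disk_latency_ns // ram_latency_ns)   # 2600000 -> 2.6 million x, not 2,000x

ram_bw_mb_s = 5 * 1000               # 5 GB/s in MB/s
disk_bw_mb_s = 100
print(ram_bw_mb_s // disk_bw_mb_s)   # 50 -> RAM bandwidth ~50x disk
```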

The real question, of course, is what actual benchmarks will reveal. Presumably optimized memory caching narrows the gap between disk storage and RAM. Which brings up the problem of configuring a database to use large RAM pools. PostgreSQL has a variety of configuration settings, but to date RDBMS software doesn’t really have a configuration switch that simply caches the whole enchilada.

Here is some discussion of front-ending MySQL with an In-Memory Data Grid (IMDG).

Here is an article on a PostgreSQL configuration to use a RAM disk.

Here is a walk through on configuring PostgreSQL caching and some PostgreSQL doc pages.

Tuning for large memory is not exactly straightforward. There is no “one size fits all.” You can quickly get into Managing Kernel Resources. The two most important parameters are:

  • shared_buffers
  • sort_mem
“As a start for tuning, use 25% of RAM for cache size, and 2-4% for sort size. Increase if no swapping, and decrease to prevent swapping. Of course, if the frequently accessed tables already fit in the cache, continuing to increase the cache size no longer dramatically improves performance.”

OK, given this rough guideline on a Quadruple Extra Large Instance 68Gb:

  • shared_buffers = 17Gb (25%)
  • sort_mem = 2.72Gb (4%)

This still leaves plenty of room, 48.28Gb, to avoid the dreaded swap page-in by the OS. Let’s assume a more normal 8Gb of memory for the OS; we still have 40Gb to play with. Looking at sort types in detail may make adding some more sort_mem helpful, so maybe bump it to 5Gb. That still leaves an additional 38Gb to drop into shared_buffers, for a grand total of 55Gb. Of course you have to have a pretty hefty set of spatial tables to use up this kind of space.
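Here is the same memory budget as arithmetic (all values in Gb, using the 25%/4% guideline quoted above):

```python
# The shared_buffers / sort_mem budget above as arithmetic (values in Gb),
# following the 25% cache / 4% sort guideline quoted earlier.
total = 68.0
shared_buffers = 0.25 * total            # 17.0 Gb at the 25% guideline
sort_mem = 0.04 * total                  # 2.72 Gb at 4%
left_over = total - shared_buffers - sort_mem
print(round(left_over, 2))               # 48.28

os_reserve = 8.0                         # assume 8 Gb for the OS
sort_mem_bumped = 5.0                    # bumped for heavy sorts
shared_buffers_final = total - os_reserve - sort_mem_bumped
print(shared_buffers_final)              # 55.0
```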

Here is a list of PostgreSQL limitations. As you can see it is technically possible to run out of even 68Gb.


Maximum Database Size         Unlimited
Maximum Table Size            32 TB
Maximum Row Size              1.6 TB
Maximum Field Size            1 GB
Maximum Rows per Table        Unlimited
Maximum Columns per Table     250 – 1600 depending on column types
Maximum Indexes per Table     Unlimited

Naturally the Obe duo has a useful posting on determining PostGIS sizes: Determining size of database, schema, tables, and geometry

To get some perspective on size, an Open Street Map dump of the whole world fits into a 90Gb EBS Amazon Public Data Set configured for PostGIS with pg_createcluster. Looks like this just happened a couple weeks ago. Although 90Gb is just a little out of reach even for a Quadruple Extra Large, I gather the current size of planet osm is still in the 60Gb range, so you might just fit it into 55Gb RAM. It would be a tad tight. Well, maybe an Octuple Extra Large 136Gb instance is not too far off. Of course who knows how big Planet OSM will ultimately end up being.

Another point to notice is the 8 virtual cores in a Quadruple Extra Large Instance. Unfortunately

“PostgreSQL uses a multi-process model, meaning each database connection has its own Unix process. Because of this, all multi-cpu operating systems can spread multiple database connections among the available CPUs. However, if only a single database connection is active, it can only use one CPU. PostgreSQL does not use multi-threading to allow a single process to use multiple CPUs.”

Running a single-connection query apparently won’t benefit from a multi-CPU virtual system, even though multiple concurrent connections will definitely spread across the available cores.

I look forward to someone actually running benchmarks since that would be the genuine reality check.


Scaling up is the least complex way to boost performance on a lagging application. The Cloud offers lots of choices suitable to a range of budgets and problems. If you want to optimize personnel and adopt a decoupled SOA architecture, you’ll want to look at Azure + SQL Azure. If you want the adventure of large scale research problems, you’ll want to look at instance arrays and Hadoop clusters available in Amazon AWS.

However, if you just want a quick fix, maybe not 2000x but at least some x, better take a look at Big RAM. If you do, please let us know the benchmarks!