Why AR Will Win — And Why it Matters HOW it Will Win

David A. Smith
Dec 21, 2020

“man is much more than a tool builder … he is an inventor of universes.” Alan Kay, “A Personal Computer for Children of All Ages”

There is a race between the extraordinary and compelling Augmented Realities — and the necessary and powerful Augmented Human. It is essential that these be in balance or we will become a slave to the increasingly virtual world we live in rather than the master of it. We consider the idea of a truly Augmented Reality — a digital extension to the world in which we will live all our future waking hours. A far more important concept is that of the Augmented Human — our next steps in evolution that will allow us to understand and control this Augmented Reality and, in turn, the universe. The Augmented Human is a better you.

What is AR

I consider VR (virtual reality) to be a full subset of AR (augmented reality); a mode if you like. I deeply understand and appreciate the difference between the two, but I think that it is as irrelevant as trying to draw a line between a smart phone and an MP3/music player.

This is my definition of what Augmented Reality will be. It is concise.

AR will be everything your smart phone is today, but it will be visible every waking second, displaying the world as a living web browser.

This last is important, as it subsumes everything else. Think about how you use the web on a daily, or sometimes even minute-by-minute, basis. Why is this important? Because AR will be like that but so much more. It will be amazingly addictive, and as we all know, addictive products are by far the best market; just ask the cigarette manufacturers.

Imagine an interesting opportunity for distraction every step you take walking down a city street. I am not just talking about restaurant menus here. You can query a quaint hotel you are standing beside and view a list of the famous people that lived … and died there. Interesting that it was once a brothel, even more interesting how that movie star ended up in a closet. Who was the architect, what other buildings did he design? When did he die and what from? What was the Spanish Flu, anyway?

Everything you look at, or that is near you, becomes a trigger for a wonderful exploration of streams of information and ideas. It will be useful in other ways too: you can look down through the pavement and see the subway lines and watch the trains move down the tracks. SimCity comes to life, and you and everyone around you are Sims. You can query for the closest subway stop and for which cars are most crowded. Are there any sexual predators on the car you are planning to get on? No problem, you have set your system to automatically tag people like that. When you do see him, he is painted a bright red and has a neon sign floating over his head. You can even select his history.

You walk past a car dealership and see a new sedan in the window. You pause to step into a fully immersive (VR) virtual car seat to explore its interior and try out a few features (without leaving the sidewalk). Oops, you are alerted that your red-painted guy has just stepped within 100 feet of you.

Time to move on.

Walking through a crowd, you can see Facebook icons pop over the heads of everyone that is a friend of yours on that service. I am talking about what is technologically feasible, not necessarily what society will approve of, but who knows. You see a good-looking person walking your way — cool, you can see that they are only one degree of separation from you on LinkedIn. You invite them into your network.

One of the creatures from a game you are playing appears from behind a car. It sees you and tries to escape, but you corner and capture it.

You get an alert that your video conference is about to start. Walk into the nearest Starbucks — immediately pick up the coffee you already ordered and paid for and sit down at an empty table. The other participants soon join you. Your AR device captures the position of your head, your face and facial expressions and of course where you are looking. Your colleagues look almost as live as if they were sitting next to you. No one else in the coffee shop even notices or cares that you are in a conversation with ghosts. One of the conference participants drops a document onto the table. It is a 3D chart of the projected growth of the new IoT toaster. The new toaster has just integrated Alexa, so it can carry on a conversation with you along with the rest of your other kitchen appliances. You have never met one of your colleagues in person and are not aware that he has removed a mole from his virtual cheek and has a virtual nose job.

The hot new app allows you to transform any of your friends into Hollywood characters, including their voices. You have decided that one of your co-workers will be Humphrey Bogart, including hat and trench coat. Your boss is Hannibal Lecter from The Silence of the Lambs; it makes everything he says so much more interesting. The receptionist becomes Rhett Butler: Clark Gable with a sort of southern accent.

Most of these capabilities already exist in some form — probably on your phone. All of them will exist soon. They are just a bit inconvenient to access today. That is why you don’t use them every second. You can’t. Yet.

Friction

That sounds like a wonderful world. Why don’t we have it now? When can we have it?

Today, AR and VR are not great experiences. They are terrible. Current devices are complex, hard to set up, even harder to use. They are ugly, heavy and don’t really do much. They do sort of show what is possible once you finally get it working, but at the same time they highlight just how bad the current state of the art is. VR games demonstrate the potential of the new medium, but even there, it is difficult to spend the same amount of time and attention that you would bring to a screen-based version.

Friction is the barrier that your product must overcome to satisfy the customer.

In marketing, the reverse of Friction is, oddly enough, Stickiness. Today, a smart phone is incredibly sticky. AR and VR have a great deal of friction to overcome.

Compare AR and VR to the smart phone in your pocket. AR fails in so many ways.

Phones are beautiful, sleek, elegant — AR makes you look like an alien Bono with an umbilical cord. VR is worse, as not only can’t you see the world, the world sees you with a literal box on your head.

Phones are invitingly seductive and like to be touched. The user strokes their finger across the screen, creating wonderful ripples in their personal probability pond. Interestingly, phones are quite useless for fine manipulation. It is very hard to edit a document on the phone, and almost impossible to select between letters in text. That fine manipulation is very important as we will see later. AR and VR have pointing rods with almost no haptic feedback at all aside from perhaps vibrating. They have none of the compelling qualities of the phone interface and are not even particularly good at gross manipulation. I know some people do amazing things in VR. Frankly, I am amazed that they can do that. I can’t. Picasso could draw amazing things in the air too. But he was Picasso.

We need a new way of interacting with this new idea space in the same way that the mouse was invented to be able to interact with our virtual desktop.

Phones are relatively inexpensive and self-contained. A great phone costs almost nothing, a high end one is still amazingly cheap for what you get. Consider the power it provides when you have it in your hand. You don’t need to plug anything into it or plug it into anything to use it. It is a complete and efficient distraction. AR and VR usually require a wire with a hefty, expensive computer on the other end. There are cheaper alternatives, but there is a significant gap in capabilities.

Phones have so many interesting things you can do with them. Communicating with your friends, watching videos, playing games. VR and AR are still emerging markets based upon some very flawed hardware. Developers consider the size of the target market for their apps, and though the phone today boasts billions of users, the user base for VR and AR worlds is in the relatively inactive millions.

Phones are instantly available and invisible. Your phone is in your pocket or purse. It alerts you that it wants attention and you have it turned on and in front of your face in just a few seconds. Most people do not remember taking their phone out to use it. It just magically appears in their hands and takes over their focus. You never think about using the phone when you are using the phone; you see through it to the rich world on the other side of the screen. On the other hand, you have no choice but to think about AR and VR when you are attempting to use them, from the time you set them up, to when you are engaged and immersed within them, to when you try to do anything within them. They force you to see the device as much as the world it is trying to present.

This last is very important. In a way, phones own you more than you own them. They constantly cry for your attention, and when they have it, they study you to determine how best to continue to keep you occupied. The phone is literally using the data it gets from you to figure out how to make itself even more addicting to you. Now that is a great drug.

If phones are such a powerful force that we can’t resist them or maybe even live without them, why does AR matter?

Because phones only OWN you. AR is everything a phone is but will BECOME you.

A Better You

AR has some serious challenges but not one of them is insurmountable. What would AR devices be like for it to offer a serious challenge to the phone?

Most of all, AR must be friction free. It must be and will be easier to use than a phone. You just put it on your head and leave it there — a lightweight pair of glasses that look just like those you might be wearing today. You no longer search all your pockets to find it and then swipe your fingerprint or enter a code or stare deep into the eyes of the phone — the camera — hoping you look somehow like you did when you got to know it. AR knows when you want it. It hears you when you speak. It knows what you are looking at. So be good for goodness sake. It will seem like you just think something, and it immediately appears. That is an invisible interface. Your phone is a boat anchor compared to AR done right.

AR must do everything a phone does now. AR devices will be phones, or at least a seamless extension of them. In turn, Will Wright pointed out that technology is an extension of the human body. If somebody hits your car, you don’t say “my car was hit”. You say “someone hit me while I was driving”. The car becomes “me”, an extension of your body. Your phone is an extension and a reflection of you in the same way, but AR is far more intimate and compelling. It is you.

AR must look cool. Or it must be worn by someone that is cool so that you think it is OK to wear it too. We are so shallow. It also needs to be lightweight, as well as cool (as in temperature).

noun: synergism

1. the interaction or cooperation of two or more organizations, substances, or other agents to produce a combined effect greater than the sum of their separate effects.

When AR is really great, and it will be, you are going to put on your AR device in the morning and wear it all day long. It will be like hearing aids or dentures, or glasses. Indeed, normal glasses (and in-the-eye glasses from cataract surgery) augment your vision so that you can see the real world. Without them, you are blind, if not totally, to a debilitating degree. The same thing is true of a light bulb in a closed room. Nothing is visible until you turn the light on. As McLuhan says: “a light bulb creates an environment by its mere presence”. Does the room exist without the bulb? Yes, but you could not see it, and would have a great deal of trouble interacting with it. But you never think about technologies like a light switch. They are friction free and have disappeared in a sense, though indeed they redefined what you are at a fundamental level. The digital world already exists, and is as much a part of our existence as the physical, but we are quite blind to it. We see glimpses of it in our offices and labs every once in a while (at least I get to), and there is no question that the virtual light will soon be turned on. And just like with our glasses and light bulbs, the medium will seem to disappear, but in fact will redefine what we are.

This thing is far stickier than a phone. It is going to be the fusion of man and the Internet. The line between the two is about to be erased. And it is going to be glorious — and terrible.

The Bus or the Bulldozer

Though it is rare that you use a computer to think, you never use a phone that way. You are a consumer of information, not a creator. Part of this is due to the phone’s limitations. You simply can’t be truly creative as a human today without fine manipulation. The phone interface is designed around gross manipulation because touch screens don’t allow you to easily specify a touch point — it is a touch area. This means the interaction targets on the phone must be larger to accommodate the gross manipulation that you can do with it. Without fine manipulation, it is very difficult to create, but virtually impossible to edit anything interesting. The vast majority of effort in creating anything interesting is in the editing, re-working, re-thinking. Things that you just can’t do with a phone. This doesn’t mean it is impossible — we are often impressed by some beautiful work that was created completely on a phone. Impressed not just because the creative effort is better — but because it could be created and completed that way at all.
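The difference between a touch area and a touch point can be made concrete with a rough, one-dimensional sketch. Everything here is an illustrative assumption rather than a measurement: the function name, the 8 mm fingertip contact width, and the 2 mm letter width are all hypothetical. The sketch simply lists which side-by-side targets fall under a contact patch of a given width.

```python
def targets_under_contact(contact_center_mm, contact_width_mm,
                          target_width_mm, n_targets):
    """Return indices of side-by-side targets overlapped by a contact patch.

    A deliberately simple 1D model: targets are laid out edge to edge
    starting at 0 mm, and the contact is an interval centered on the touch.
    """
    lo = contact_center_mm - contact_width_mm / 2
    hi = contact_center_mm + contact_width_mm / 2
    hits = []
    for i in range(n_targets):
        left = i * target_width_mm
        right = left + target_width_mm
        if lo < right and hi > left:  # the two intervals overlap
            hits.append(i)
    return hits


# Hypothetical numbers: an 8 mm fingertip over 2 mm wide letters.
finger_on_text = targets_under_contact(10.0, 8.0, 2.0, 20)     # several letters at once
# The same fingertip over 10 mm wide buttons.
finger_on_buttons = targets_under_contact(15.0, 8.0, 10.0, 4)  # a single button
# A near-point cursor (0.1 mm) over the same 2 mm letters.
cursor_on_text = targets_under_contact(7.0, 0.1, 2.0, 20)      # a single letter
```

Under these assumed sizes the fingertip covers four letters at once but only one large button, which is why phone targets have to grow, while a point-like cursor can still single out one letter.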

That limitation in creativity is partially due to the inherent limitations of the design of the smart phone. It has a small screen and you have big fingers. The display itself is also small, which means that you don’t really have the room to express yourself even if you wanted to. Real artists need room.

What is quite interesting is that this limitation became something of a feature, as far as the phone manufacturers saw it. The phone is not a very good creative device. However, it is an amazing consumer device. Consuming has two parts: select what to consume, then consume it. We don’t need a particularly fine interface for that. More problematic was Apple Computer’s decision to enshrine this “feature” and to not allow any kind of programming environment on their phones at all. In fact, Scratch, a block programming language designed for children that works great on a phone, was blocked from the app store for many years until recently. This was an attempt to enforce the consumer nature of the device and it succeeded.

Phones are designed around the idea that you are a consumer of information and ideas. You can control where you go, but you can’t go anywhere that the phone hasn’t already anticipated. This is like getting on a bus. If you already know where you want to go, this is a convenient and fast way to travel there.

However, you can’t take the bus to someplace that is new and unexplored. For that, you need a more refined and powerful set of tools. You are better off with a bulldozer, something that is responsive to new ideas and lets you go anywhere you want. There are no barriers, except your imagination.

That is the true promise of AR — assuming it can escape the challenge of gross manipulation versus fine manipulation.

Amplifiers

Why is it necessary to use AR and VR as creative platforms? How do we ensure that the human is empowered to extend, explore and share their ideas and their worlds? How do we amplify the properties that define humans as creative tool builders?

First, we can’t really understand how AR should work until we can create and modify the AR system from within it. It must be a creative amplifier. Our creations extend the capabilities of the system which then allows us to create something even better. The problem of AR today is that it is being built from the outside, like a ship in a bottle, with the intent that it can then be launched and won’t sink. It might be beautiful indeed, but you are coming at it from the wrong direction. Instead, we need to launch a raft with just enough capabilities so that we can design a ship as we learn to navigate the ocean. We will design the right kind of ship because it is the one we will be living on. We need to invent the VR and AR interfaces from within them and as we use them. This is not a new idea. It was the foundation of what Doug Engelbart demonstrated 50 years ago: create tools that allow you to create even better tools. Xerox PARC created the modern UI for computers and phones from within Smalltalk. Smalltalk allows the developer to reinvent every aspect of the system, from the foundational operating environment underneath to the actual widgets that the user can manipulate. Almost everything you use on a modern computer or phone was invented at Xerox in this way. I believe it is the only way to create a truly new and powerful platform. Will Wright describes it this way:

“We now, with these little micro-worlds, have the ability to basically externalize what is in our imagination and share it with other people. You know, it used to be that you had to have a very rich skill set, like you had to be a fine artist to do that — you know, to paint something in your imagination and then share it with other people. But now with these tools, the creative leverage they give us, average, casual game players have the ability to externalize, create things out of their imagination, share it with other players, and actually have these shared imaginary worlds. And so I think that’s one of the examples of the computer giving creative leverage — a creative amplifier.”

Second, we need a vehicle to amplify our intelligence, both individually and collectively. Humanity is in a race with itself to determine its fate. We may be losing this race. We need to provide the right side with an unfair advantage that it very seriously lacks today. We may not be able to get there from here, at least not directly. Doug Engelbart demonstrated a new starting line in this race 50 years ago. He not only showed many of the foundations of human-computer interface that we now take for granted, he also provided us with a new perspective on what the computer could really mean as an extension of a human. This was not an accident, but a pre-planned intentional act to redefine the nature of how the computer and human engage to become something greater than their parts. A synergism that defines the symbiont.

Even more spectacular was how he harnessed the newly empowered symbiont to include other humans, equally empowered with these technologies, so that they could collaborate to explore and create even greater things. His goal wasn’t to merely amplify the intelligence of a human; he intended to amplify the intelligence of humanity. For AR to truly shine, we must also look at it as a fundamentally disruptive event. AR isn’t a better phone, though the first popular versions of it will likely embrace that approach. They will be the first demonstration of what Augmented Reality has in store for us, for better and worse. We need something far more powerful: a tool that allows us to think, explore, and create new tools that amplify our intelligence even more. Engelbart’s tricycle is a great analogy to the problem. Extending the phone into AR, with its inherent limitations and biases, is like improving the capabilities of a tricycle. The new tricycle will be based on user feedback and will be a very compelling product, very stable and safe. It offers an awesome user experience, and is what everyone thinks they want. But that process will never result in a bicycle, a far more powerful product that dramatically multiplies its user’s speed. This is a disruptive product and requires what Clayton Christensen referred to as “discontinuous” innovation. Apple used to refer to the Macintosh as a Bicycle for the Mind, and indeed the Macintosh provided many of us with an incredible vehicle to construct new realities with. The Mac was NOT a better Apple ][. It was a very new thing. Our challenge is to ensure AR is done right.

Third, we need an intention amplifier. We need a new approach to user interaction in the 3D world that provides us with the free-form gross gestures that we have, but also the fine control that we get with the mouse, another Engelbart invention. The mouse was not an accidental design; Engelbart explored many alternatives for interfacing with the computer. The mouse was significantly better. The mouse amplifies the user’s actions while retaining control. Even on a large, multi-foot display like the one I am using, I can move my mouse from one side of the screen to the other and still be able to place the cursor between two letters to edit this document. The mouse amplified my reach but maintained the necessary fine control.
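One common way a pointing device can amplify reach while keeping fine control is a speed-dependent gain, often called pointer acceleration: slow hand movements map roughly one to one, and fast movements are multiplied. The sketch below is only an illustration of that idea. The function names, the gain values, and the speed thresholds are all assumptions chosen for readability, not the behavior of any particular mouse driver or of Engelbart's original device.

```python
def pointer_gain(speed_mm_s, low_gain=1.0, high_gain=8.0,
                 knee_mm_s=50.0, sat_mm_s=400.0):
    """Return the cursor-to-hand gain for a given hand speed.

    Below the knee the gain stays low (fine control); above the saturation
    speed it stays high (long reach); in between it ramps linearly.
    All default values are illustrative assumptions.
    """
    if speed_mm_s <= knee_mm_s:
        return low_gain
    if speed_mm_s >= sat_mm_s:
        return high_gain
    t = (speed_mm_s - knee_mm_s) / (sat_mm_s - knee_mm_s)
    return low_gain + t * (high_gain - low_gain)


def cursor_travel_mm(hand_travel_mm, duration_s):
    """Cursor travel for a hand movement made at constant speed."""
    speed = hand_travel_mm / duration_s
    return hand_travel_mm * pointer_gain(speed)


# A quick 100 mm sweep in a quarter of a second crosses a wide display.
fast_sweep = cursor_travel_mm(100.0, 0.25)  # 800.0 mm of cursor travel
# A slow 2 mm nudge over one second moves the cursor the same 2 mm.
slow_nudge = cursor_travel_mm(2.0, 1.0)     # 2.0 mm of cursor travel
```

With these assumed settings the same hand covers a display several times wider than the desk space it uses, yet a slow nudge still moves the cursor by only the distance the hand moved. An AR equivalent would need some comparable mapping for gaze, hand, or voice input.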

We have far better sensors than Engelbart had in his day. An equivalent to the mouse is certainly within reach. The body and hand are obvious targets for enhancement, but it is a mistake to simply project the body’s motion into the 3D world. Great user interfaces are invisible. The user thinks about what they want and the necessary actions to accomplish it occur. You don’t think about using a mouse; you think about what you want to be true. The best interfaces for AR will be based upon eye-tracking and voice. Hands, finger motions and gestures will aid that interface in the way meta keys aid the keyboardist, but these will be exceptions.

Fourth, we need a control amplifier. Facebook and Google make money by aggregating large amounts of attention and selling it to the highest bidder. Further, they train, or allow others to train, the users to desire certain things. The main goal of machine learning on these platforms is to understand what people want and to then provide it to them. We see an almost real-time morph into these machine learning systems exploiting a user’s weakness to train them in what they might desire. Manipulating human behavior is a game, and machine learning systems are extremely good at games. The alternative is that the machine learning systems be harnessed to enable the individual user’s creativity and exploration. They should be full partners in engaging, constructing and exploring new virtual universes. They need to become another vehicle for fine manipulation by the user, an even sharper blade for our bulldozer enabling us to drive anywhere we want.

We are in a race between the extraordinary, compelling and addicting augmented realities and the necessary and powerful augmented human. It is essential that these be in balance or humanity will become a slave to the increasingly virtual world we live in rather than the master of it. Augmented Reality is the place where we will live all our future waking hours. The “real” world as we know it will still be there; the digital world will co-exist and be mixed in with it, and be just as relevant as a light switch in a dark room is today. The digital world exists; you just can’t see it yet. The Augmented Human is our next step in evolution. We will understand, control, and extend this augmented reality and, in turn, the universe. We are about to turn on the light.

A version of this paper was originally printed in Michael Swaine’s PragPub. https://theprosegarden.com/

David A. Smith

AR, VR, AI, 3D Pioneer. I invented 3D portals and crates in games. I wrote the first 3D adventure/shooter, and created Croquet - it redefines collaboration.