Monday, July 7, 2008

More thoughts on the future of the web

This is a continuation of my earlier post titled "It's time for a new Web". I wanted to ramble more on the subject.

So fundamentally what we're talking about is bridging the gap between the internet and your operating system: letting web applications install libraries that have direct access to the graphics, sound, I/O, and file system layers. The trick is exposing all of the important interfaces in a useful and secure way. The question is how low-level you go and how much freedom you allow.

How low?

One idea would be to simply take standards such as OpenGL and OpenAL, or maybe SDL, and make that your base API. Of course you would need additional APIs for other things such as networking and local file storage. Most programmers would be more than happy with this level of access. In fact, most would be using libraries that sit on top of OpenGL and simplify and abstract it even further.

But why stop at the level of OpenGL? What if you exposed the base hardware layer in such a way that OpenGL would just be an internet library on top of it, in the sense I talked about in the previous post?

How much freedom?

The question of how much freedom you give the programmer is important. For example, if you just gave direct access to your local filesystem, malicious sites and applications could wreak all kinds of havoc on your system. So in this case it makes the most sense to give each application a space of its own that is managed by the browser. This space would be insulated from all other applications. The downside is that you would lose the ability to interface directly with other applications' data. However, I feel that it's a better design to have each application provide interfaces on its own through some kind of communication protocol.
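To make the browser-managed space a bit more concrete, here is a rough sketch of how it might behave. Everything in it is hypothetical (the AppStorage and SandboxError names, the idea that the browser hands out simple alphanumeric application IDs); it only illustrates the rule that an application can read and write inside its own directory and nowhere else.

```python
import tempfile
from pathlib import Path


class SandboxError(Exception):
    """Raised when an application tries to reach outside its own space."""


class AppStorage:
    """Hypothetical browser-managed storage area for a single application."""

    def __init__(self, browser_root: Path, app_id: str) -> None:
        # Assumption: the browser assigns IDs, and they are plain alphanumerics.
        if not app_id.isalnum():
            raise ValueError("app_id must be alphanumeric")
        self.root = (browser_root / app_id).resolve()
        self.root.mkdir(parents=True, exist_ok=True)

    def _resolve(self, relative_name: str) -> Path:
        # Resolve ".." segments and symlinks, then require that the result
        # still lives somewhere below this application's root directory.
        candidate = (self.root / relative_name).resolve()
        if self.root not in candidate.parents:
            raise SandboxError(f"{relative_name!r} is outside the sandbox")
        return candidate

    def write(self, relative_name: str, data: bytes) -> None:
        path = self._resolve(relative_name)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def read(self, relative_name: str) -> bytes:
        return self._resolve(relative_name).read_bytes()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        mail = AppStorage(Path(tmp), "mail")
        photos = AppStorage(Path(tmp), "photos")
        mail.write("inbox/1.txt", b"hello")
        print(mail.read("inbox/1.txt"))
        try:
            photos.read("../mail/inbox/1.txt")
        except SandboxError as err:
            print("blocked:", err)
```

A real browser would need much more than path checks (quotas, permissions, the communication protocol between applications), so treat this only as an illustration of the isolation idea.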

Another problem with too much freedom is the loss of structure. I mentioned this earlier as well. If the programmer is just thinking about pixels, and the pixels happen to make up text, what tells other applications that those pixels equate to ASCII characters? In other words, how do we apply semantics to different concepts so we can do things like copy and paste?


One critical requirement for this vision of the web to succeed is self-awareness. In other words, there has to be an API that lets you do things like query the applications and libraries that are installed.
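As a very small sketch of what such a query API might look like, assuming a made-up Registry object that the browser would expose (none of these names come from any existing browser):

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Installed:
    """One installed item; the fields are illustrative assumptions."""
    name: str
    kind: str      # "application" or "library"
    version: str


class Registry:
    """Hypothetical browser-side record of what is installed."""

    def __init__(self) -> None:
        self._items: Dict[str, Installed] = {}

    def install(self, item: Installed) -> None:
        self._items[item.name] = item

    def is_installed(self, name: str) -> bool:
        return name in self._items

    def query(self, kind: Optional[str] = None) -> List[Installed]:
        # Return everything, or only items of one kind, sorted by name.
        found = [i for i in self._items.values() if kind is None or i.kind == kind]
        return sorted(found, key=lambda i: i.name)


if __name__ == "__main__":
    registry = Registry()
    registry.install(Installed("opengl", "library", "2.1"))
    registry.install(Installed("mail", "application", "1.0"))
    print([item.name for item in registry.query("library")])
    print(registry.is_installed("spreadsheet"))
```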

The idea is that the browser doesn't have a defined UI (though there would be a basic default one). Your home page is your desktop environment. So users would choose different home pages, such as Google or Yahoo, and those pages would provide the user with their application links, taskbar, tray, etc.

Of course certain interfaces would need to be defined, such as the concept of an application being open, its window properties, user messaging (toaster GUI), and other basics. However, these concepts would not come with attached graphical standards, even though they would often be presented graphically. What I mean is that an application wouldn't know how it was being accessed or represented; it would only know whether it was visible and what its size was (maybe some other info as well, but not much). Of course, the application could attach to events and query the other windows if it wanted to interact with them in some way.

Part of being self-aware is having an event system that all applications could access. So events would be fired when applications are closed, shown, entered, exited, etc. This would let the desktop app do something like track time spent in each app.
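Here is a minimal sketch of that time-tracking idea, assuming a shared event bus with "entered" and "exited" events that carry an application ID and a timestamp. The EventBus and TimeTracker names, the event names, and the handler signature are all invented for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Assumed handler signature: (app_id, timestamp in seconds).
Handler = Callable[[str, float], None]


class EventBus:
    """Hypothetical system-wide event bus that any application can use."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_name: str, handler: Handler) -> None:
        self._handlers[event_name].append(handler)

    def fire(self, event_name: str, app_id: str, timestamp: float) -> None:
        # Copy the list so handlers may subscribe others while we iterate.
        for handler in list(self._handlers.get(event_name, [])):
            handler(app_id, timestamp)


class TimeTracker:
    """Desktop-side listener that adds up the time spent in each app."""

    def __init__(self, bus: EventBus) -> None:
        self.totals: Dict[str, float] = defaultdict(float)
        self._entered_at: Dict[str, float] = {}
        bus.subscribe("entered", self._on_entered)
        bus.subscribe("exited", self._on_exited)

    def _on_entered(self, app_id: str, timestamp: float) -> None:
        self._entered_at[app_id] = timestamp

    def _on_exited(self, app_id: str, timestamp: float) -> None:
        # Ignore an "exited" event that has no matching "entered" event.
        started = self._entered_at.pop(app_id, None)
        if started is not None:
            self.totals[app_id] += timestamp - started


if __name__ == "__main__":
    bus = EventBus()
    tracker = TimeTracker(bus)
    bus.fire("entered", "mail", 0.0)
    bus.fire("exited", "mail", 90.0)
    bus.fire("entered", "mail", 100.0)
    bus.fire("exited", "mail", 130.0)
    print(dict(tracker.totals))
```

The desktop application here is just another subscriber, which fits the idea that the desktop is only a home page with access to the same events as everything else.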

Breaking out of the box

While we're at it, why not consider implementing things such as P2P as a standard? Wouldn't it be nice if you just downloaded an application update from your coworker next door instead of from the main site? What if P2P was a standard resource for programmers?

What about user interface considerations? Multiple mice, multiple keyboards, multiple monitors/screens: how would all of these interfaces be provided and queried? Do you abstract their input, or do you give the raw input and let libraries deal with it? What about new kinds of devices? For example, pressure-sensitive multi-touch screens could be abstracted as multiple cursors, but they're really a whole array of pressure values. Do you let the browser layer abstract such a device as a cursor, provide its input directly to applications, or both?
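To illustrate the "both" option, here is a small sketch in which the raw pressure array stays available and a library function derives cursor positions from it. The grid format, the threshold value, and the cursors_from_pressure name are assumptions made up for this example, and picking strict local maxima is just one simple way to do the abstraction.

```python
from typing import List, Tuple

# Assumed raw input: a rectangular grid of pressure values from 0.0 to 1.0.
PressureGrid = List[List[float]]


def cursors_from_pressure(grid: PressureGrid,
                          threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Abstract a raw pressure grid into (row, column) cursor positions.

    A cell becomes a cursor when its pressure reaches the threshold and is
    strictly higher than all of its neighbours. Flat plateaus of equal
    pressure produce no cursor in this simplified version.
    """
    cursors = []
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value < threshold:
                continue
            neighbours = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(len(grid), r + 2))
                for cc in range(max(0, c - 1), min(len(row), c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(value > n for n in neighbours):
                cursors.append((r, c))
    return cursors


if __name__ == "__main__":
    raw = [
        [0.0, 0.1, 0.0, 0.0],
        [0.1, 0.9, 0.2, 0.0],
        [0.0, 0.2, 0.0, 0.7],
        [0.0, 0.0, 0.3, 0.4],
    ]
    # An application could read `raw` directly, or ask for cursors instead.
    print(cursors_from_pressure(raw))
```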

Ending thoughts

Hopefully some of these ideas made sense and you understand what I'm imagining. In a sense this is the holy grail, the thing that would remove operating system barriers and completely standardize computer software while letting programmers achieve anything they could imagine. I don't think it would be easy. You'd need to have a good core group designing the specs and then a marketing engine that could convince people to develop with the new standard and gradually bring it to the mainstream.

I think if such a system were to arrive it would at first be like today's browsers. It would have its own HTML rendering engine and act just like any other browser, except that it could do so much more. As developers began releasing internet applications through it and converting existing ones, people would live in their browser more and more. There would be a point where it seemed like you had two desktops (with tabs in your browser, you already have two taskbars). People would begin to fullscreen the browser most of the time; eventually operating system distributions would focus solely on getting the user loaded into the browser and not even bother with their own desktop environment.
