Graphic Recording

I didn't know it, but all my life I've engaged in "graphic recording" when it came to exploring new ideas or learning. I never went as far as the artists who made a series of recordings for the sustainable agriculture and food conference, but my subjects were technical, and I was a technical kid growing up, so my "confections," as Tufte calls them, were more mathematical, graphical and textual in nature. I used them to illustrate things to myself, like working out visually how cycles represent waveforms in musical instruments. Now, I see them as graphic recordings. I was a bit ashamed of them, since I thought it meant that I wasn't a good learner and tried to suppress or limit them. That was a mistake.

The drawings are simply wonderful. I got put onto them by Brenda Dawson, who tweeted about the graphic recordings made for the March 29, 2009 conference, the Inaugural National Symposium on Food Systems and Sustainability at the University of California, Davis. How much better a "presentation" these graphic recordings make than a PowerPoint presentation!

These drawings are a lot like my vision for an information system, called Strands, which would be as thick and filled with complexity as the Talmud and as visually expressive as these graphic recordings. If only the web could be like this. When I think of Twitter and Tabloo, if they could be combined, I think we'd be close. Tabloo enables users to create visual narratives (through the structure and relationship, size and aspects of images) and Twitter enables users to create conversations out of small fragments of thought flowing continually.

Biological Construction and Networked Content Creation

The order and symmetry of biologically created structures, such as an egg or the human body, are expressions of how correctly those biological systems worked to construct the natural artifact. Biological organisms are collections of cells cooperating with each other. The order and correctness is an expression of the successfulness of the collaboration.

An egg comes out more egg-like when the biological processes working to make it cooperate and collaborate more correctly in its construction. I believe this has implications for the collaborative processes operating in networked software development and information science. The biological process of construction is inherently different from the one humans have inherited from their tool making and industrial heritage. What will we make of it?

Where are we going?

The issue of whether people should pay for forums or not came up on dpreview. With the current economy, I expect how to pay the bills will be a growing question for many web services.

The problem is that with forums there is perfect competition. Anyone can set up a forum and run it for next to nothing. If one forum decides to charge a fee, the users can flee to another forum. The only reason they might stay is because of the audience. For example, photographers pay to host their photographs on Flickr primarily because it provides a rich audience of people who love to look at still photographs. Flickr is the Life and Look magazine of our time. It is the revival of the great picture magazines, and not because of its technology (that helped orient the site in the right direction to succeed; just look at the abject failure of Picasa to be social, too little too late). Flickr just happened to be where most people who like to look at pictures gathered, mostly because of its blog-like streams of ever-changing pictures and social tools. It is easier to pay a small fee to use Flickr (perhaps even to "read" it) than it would be to overcome the "capital" costs of changing sites. Flickr users have a lot invested in Flickr and it might just cost less to stay and pay than to move elsewhere. Besides, there is nowhere else to move. The closest thing I could see to Flickr would be for every photographer to put up their own photo blog software and then join photoblogs.org, which would become the "magazine" and "social hub." This is a distributed vision of photo sharing online. I used to wonder which would be successful. But it really was simple: Flickr did it all for you, some for free, a little more for pay, well worth it to promote your photography.

Despite the somewhat juvenile and absurd environment of Flickr with regard to art photography (you know, the dozens of people giving out "Great Photograph" awards to pedestrian, derivative and mediocre images mostly to promote themselves or because they are too young to know what a derivative image is), it is useful to professional photographers and art photographers because Flickr is where the eyeballs are. It attracts people who still love still photography; in this age of video, it is a bit of a miracle that anyone takes an interest in photography. However, photographs can make the world sit still long enough for people to pay attention, and that is a very similar experience to poetry, which at least in part is there to draw attention to things. I've heard from professional photographers that they get an order of magnitude more requests for work through Flickr than through one of the professional portfolio sites.

One reason, perhaps the principal one, that Henri Cartier-Bresson and other great photographers became well known was that their images were published in the great picture magazines. When television came along, the picture magazines went into decline. Photojournalism began its long decline at this time, for the simple reason that people could learn about their world visually through television, a more attention-grabbing and free medium (the barrier to entry for television was lower, you didn't have to be intelligent to watch it, a good example of where a low barrier to entry is destructive to society). Without the picture magazines it was no longer possible for a photographer of acknowledged artistic merit to become known and for their images to have significance in society. The audience was gone. Flickr reestablishes this audience.

So the question still stands. Will people in the future pay for their online content? Pay to create it? Pay to consume it? What is happening now? People are already paying to create content. They pay for a Flickr account with better tools. They pay for services to create graphics, 3D art, property in virtual communities. A few sites charge for reading content, but not many. But given human history and the recent past, when most content was paid for, in newspapers, books and magazines (except for TV), it seems reasonable to assume the free ride will be over someday.

There may be a tipping point when a non-pay site is no longer competitive. When most good content has gone to pay sites and the community of interest for that content willing to pay is consuming all they can (this is what happens with books and magazines today), the other sources will be driven out in a kind of perfect competition. The free sites will be filled with garbage and what passes for content on local cable access.

The network is not the old traditional world of libraries and publishers. It will be different. Project Gutenberg. Open source projects. The collections of enthusiasts sick and tired of the crap shoveled out by the traditional content and software businesses have taken it upon themselves to produce quality products where the marketplace would not or could not. This is an order of magnitude different from the pre-networked world, where people could not work together, providing little bits of effort or expertise to collaboratively create a cultural artifact. This is entirely new and we don't know where it's going.

As an aside, the idea of tipping or donation comes up. Frustrated with no way to fund my original website, I considered taking a modern high tech variation on the PBS approach. I considered (in the 1990s) creating a content management system where each article would have a countdown timer displayed like a reverse donation thermometer. If you didn't contribute something to the article, it would count down; when it reached zero, the page would be pulled from the site. Of course, the ability to cache networked content presents a threat to such schemes: the Wayback Machine can regurgitate considerable missing content and so can the Google search cache. What about caching? If Wikipedia's funding were to dry up and it blew away today, would its content still remain available in a myriad of niches around the network? On people's computers, disks, servers here and there, in caches? Would it evolve another life in a peer to peer environment? Will all information become distributed over billions of cell phones and have no location at all?

Blogging the Archives

A vital interest of mine is access to archives. I've been interested in the possibilities inherent in the web and network for increasing access to archives and enabling a greater number of non-academics to browse, organize and surface archive holdings. One of the most significant ways of exposing the holdings of an archives is blogging the contents.

We really haven't got there yet, but I've noticed a small trend, which I hope signifies the beginning of exponential growth, of people blogging artifacts. I do not remember the first site I came across where a blogger was posting pictures of artifacts, usually photographs from an online catalog of a museum, but here are some recent finds.

Illustration Art

All Edges Gilt

If we could just get every artifact in the world's museums and archives photographed or scanned and online, and give the tools to blog the contents to millions of ordinary people interested in telling the stories of these cultural objects, think of how rich that would be. I don't know if people will do this, but I do know that ordinary people have a lot to contribute. Academics cannot know everything; each is an isolated individual, no matter how expert they are, and there is a very Long Tail out there of family members, amateur historians, hobbyists and who knows who else that know something about cultural and historic artifacts. Maybe they will be willing to contribute. It will likely be only two percent, like Wikipedia authors, but that small percentage can do a lot of good.

As an aside, author and developer Liam Quin has a site, fromoldbooks.org which has great potential to provide fodder for bloggers. The interface to this digital archive of old book scans is easier to use and better than ones I've seen institutions deploy.

I wonder, also, if this phenomenon is not somehow similar to the Cinematheque, not just an archive, but concerned that people actually view or interact with the artifacts.

Update: Shorpy is a commercial site, which shows how successful blogging the archives can be. The site appears to have developed a following, with, I imagine, readers checking in each day to see what new photographs are posted. The blogger acts as curator by selecting images that will be of interest to the readers and arranging them into albums, possibly by narrative (using Tabloo would be a good way to achieve this).

This fits exactly with the idea of people being able to easily find images of their local area in the past and the idea of "blogging the archives" at its simplest and most effective. The power of simply posting images and their captions, without any commentary, is surprising. It is encouraging to see people are interested and willing to participate in the interpretation and "unpuzzling" of old photographs. One of the pleasures of old photographs is rediscovering what lies behind the mysteries the images present.

Why Tag Clouds are Beating a Dead Horse

Tag clouds are dead. I don't want to mince words. I've been waiting for a long time for someone to say so, to let everyone see the elephant in the living room. What interests me is why tag clouds are dead.

About ten years ago I was working on a prototype web application. It never saw the light of day. But it was called Strands and consisted of a wiki-like content management system that allowed anyone (it was based on SoftSecurity) to create pages, to post and edit content. Any author could include single keywords in the text. These would be automatically scooped up and entered into an index. You could display the posts associated with (containing) any keyword listed on a page like search results. The idea was that content could be navigated in any number of ways according to keywords added by users. It wasn't social. It didn't know the user who contributed the keyword. The idea was to destroy hierarchy and create a user centered order to information, something close to the folksonomy (but not quite because it didn't care about who submitted a keyword). One version did not allow linking between pages, no "wikiword" links, the idea being that all navigation was by keyword links, either in content or on the "strand" pages listing all content belonging to a keyword.

One of the other ways of navigating considered was by popularity of keyword. The system could generate a list of keywords based on how many posts contained or were associated with them. You may start to find the elements of this system familiar. "Strands" are posts listed by tag. Keywords are tags. Navigating by popular keywords is a tag cloud. The ideas for this system partly developed out of work I'd seen on the web where posts were ordered by single keyword. The other reason was I have a terrible time categorizing anything, I can't decide which category something could go in. I am incredibly bad at and hate categorizing anything, so I decided the wiki element would let visitors to my site categorize my junk for me.
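The keyword index, the "strand" listing, and the popularity ordering described above can be sketched in a few lines. This is only an illustration, not the Strands code. The `[keyword]` marker syntax and the names `build_index`, `strand` and `popular_keywords` are assumptions, since the prototype's actual markup is not described here.

```python
import re
from collections import defaultdict

# Assumption: authors mark a keyword in the text as [keyword].
# The real Strands marker syntax is not described in the post.
KEYWORD_PATTERN = re.compile(r"\[(\w+)\]")


def build_index(posts):
    """Map each keyword to the set of post ids whose text contains it."""
    index = defaultdict(set)
    for post_id, text in posts.items():
        for keyword in KEYWORD_PATTERN.findall(text):
            index[keyword.lower()].add(post_id)
    return index


def strand(index, keyword):
    """Return the sorted post ids for one keyword (a 'strand' page)."""
    return sorted(index.get(keyword.lower(), set()))


def popular_keywords(index):
    """List keywords by number of posts, most popular first, ties by name."""
    return sorted(index, key=lambda k: (-len(index[k]), k))


posts = {
    1: "Picked up [tomatoes] at the [market] today.",
    2: "The [market] was crowded.",
    3: "Growing [tomatoes] from seed.",
    4: "A quiet [market] morning.",
}
index = build_index(posts)
```

Here `strand(index, "market")` lists the posts under one keyword, and `popular_keywords(index)` gives the ordering a tag cloud would draw from. Nothing in the index records who added a keyword, which matches the non-social design described above.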

If this were not a blog, I'd spare you all this personal history, but it does show you why I am interested in the question of why tag clouds suck.

When I visit a website with a tag cloud, I tend to pay close attention to it. I noticed that I never bothered clicking on them, never used them. When I thought about why, one of the things I noticed was that nearly every tag cloud consisted of a number of large tags I could count on one hand, and the rest were undifferentiated in size. One of the solutions that came to mind was displaying tags by popularity on a logarithmic scale, which could help increase the difference between the less popular tags. I'm not that great at math, so I would need to leave it to someone else to work this out. But the idea is to create greater differentiation visually among the less differentiated tags.

The other problem with this is there are only so many font sizes that are easily usable on the web. This worsens the differentiation problem.
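A rough sketch of the logarithmic idea, mapped onto a small fixed set of font sizes, might look like the following. The function name `font_sizes`, the five point sizes and the sample counts are all made up for illustration. Counts are assumed to be at least 1, since the logarithm of zero is undefined.

```python
import math


def font_sizes(tag_counts, sizes=(10, 12, 14, 18, 24), scale=math.log):
    """Assign each tag one of a few font sizes from its scaled post count.

    Assumes every count is >= 1. Pass scale=float for a linear scale.
    """
    if not tag_counts:
        return {}
    scaled = {tag: scale(count) for tag, count in tag_counts.items()}
    low, high = min(scaled.values()), max(scaled.values())
    span = high - low
    result = {}
    for tag, value in scaled.items():
        # Position of this tag between the least and most popular, 0.0 to 1.0.
        position = 0.0 if span == 0 else (value - low) / span
        # Pick a size bucket; the most popular tag lands in the last one.
        bucket = min(int(position * len(sizes)), len(sizes) - 1)
        result[tag] = sizes[bucket]
    return result


counts = {"photography": 1000, "flickr": 100, "archives": 10, "wiki": 1}
log_sizes = font_sizes(counts)
linear_sizes = font_sizes(counts, scale=float)
```

With these sample counts the linear scale puts everything except the top tag in the smallest size, which is the undifferentiated cloud described above, while the log scale spreads the four tags across four different sizes.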

The other concern I had developing the keyword based application was that chaos would ensue. People tend to prefer order. Would it help or hurt for people to be navigating by tag? Tags don't always apply to the subject. Their strength is freedom, freedom from controlled vocabularies and rigid meanings, but without those restrictions tag-chaos can reign. Wikis always had a kind of randomness to them and so does tag structured and navigated content.

I almost never click on tags in WordPress blogs for this reason. It usually produces a result that widens, not narrows, my search. Nielsen observed that clicking on a link has a penalty, and the trouble with tags is they have an uncertainty penalty.

The closest I've ever seen to a realization of the keyword based navigation idea is a photo gallery developed by Alex Wilson some years ago. You can see it still in operation here. It's a great idea and an excellent implementation, I don't know why I didn't go ahead with my own version instead of abandoning it (doubly, since the eventual goal was for organizing photographs). It makes the homepage a tag cloud and each detail page with a photograph displays a vertical row of thumbnails to photographs linked by tags, which is very similar to the way the Strands pages listed posts according to tag (like Flickr pages with the tags next to the image). Alex recently switched to a standard gallery system for this exact reason, that visitors and customers apparently found the tag-navigated album confusing.

I love tags. I use them like I feel they were supposed to be in this blog, I just write any significant word that comes into my head about the subject. I don't care that they create long lists of tags, since I only use them as a memory aid. They are terrible for people navigating the site and categories would probably be better. Tags aid memory, they aid discovery and exploration, but I'm uncertain that they are good finding aids.

I'm sure others have observed this before, but I've kept quiet about it, so I may be late to the party, but still, it's a useful discussion, to dissect why tags ultimately fail to live up to the (strange to me) hype they received. Every new web technology seems to be announced like the second coming.

So, yes, tag clouds are beating a dead horse. Even the little sets of tags next to blog posts don't really do much for me, not even on my own site, or they don't seem to do much for visitors in my view.

The other thing that tortured me developing the keyword based navigation was whether to allow spaces in keywords, which would prevent combining keywords like chicken+soup and create confusion (separate keyword threads of navigation) between "farmers market" and "farmers_market." I worried a bit about misspellings, but not too much since I didn't like controlled vocabularies.
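One possible way out of the spaces dilemma, sketched here purely as an illustration and not as what Strands actually did, is to fold spaces and underscores into a single canonical form and reserve "+" for combining keywords. The names `normalize_keyword` and `parse_query` are hypothetical.

```python
import re


def normalize_keyword(raw):
    """Fold case and treat any run of spaces or underscores as one underscore."""
    return re.sub(r"[\s_]+", "_", raw.strip().lower())


def parse_query(query):
    """Split a combined query such as 'chicken+soup' into normalized keywords."""
    return [normalize_keyword(part) for part in query.split("+") if part.strip()]
```

Under this scheme "farmers market" and "farmers_market" land on the same strand, and "chicken+soup" still reads as two keywords. It does nothing about misspellings, which would still create separate threads.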

References: Tag Clouds_Rip and ZigTag supposed to solve these problems.

Capturing and Refining User Expertise

One of my longtime interests has been how to create a system that captures the knowledge of experts and refines it into a single resource. I was attracted to wikis early on by their communal authorship, but found the lack of structure unsuitable for my needs. What I wanted, for two of my early efforts, one a site intended to help family photography historians answer questions about old photographs and the other a site for programmers to find help with coding questions, was a way to let users engage in a Q and A and then somehow capture and distill the expertise into a more traditional article format (like a wiki page), which could be maintained by everyone. I wanted to capture the expertise emerging from the group discussion through some mechanism.

I ended up developing a content management system for the coding site, which had the ability to "fold" a comment thread attached to an article back into the article for editing. I also developed a tool, which could take a forum thread and turn it into an article text for editing. These solutions required a lot of manual effort to whip the unruly comments into a coherent article.
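The "fold" step can be pictured with a very small sketch. This is not the original content management system's code. The `Comment` class, the `fold_thread` function and the bracketed attribution line are assumptions chosen only to show the shape of the operation.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    author: str
    body: str


def fold_thread(article_text, comments):
    """Append each comment to the article text as an attributed section.

    The result is a single draft an editor can then rework by hand.
    """
    parts = [article_text.rstrip()]
    for comment in comments:
        parts.append(f"[From {comment.author}]\n{comment.body.strip()}")
    return "\n\n".join(parts)
```

The mechanical part is trivial, which is the point: the real work is the manual editing afterward, turning the appended comments into a coherent article.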

All along I wanted to introduce the communal editing feature of a wiki to this process, but I faced the obstacle of how to overcome the distinction between communal content and content owned by the user posting it. I racked my brains to design the system to somehow enable a transition from personal content to communal content, so that question and answer sessions centered around a code example or problem, could be "folded" into a more communal source of information, refined and with conclusions. But never found a solution.

Originally, I had wanted to develop my coding help site as a Q and A site like Experts Exchange. This explains why I needed some way of converting the knowledge captured by the Q and A session, if there were a solution, into an article form. A Q and A session usually results in exposing a lot of valuable knowledge from experts. I wanted a way to capture and refine this so people could learn to code better from it.

Stackoverflow.com is a Q and A site for coders. It is simply excellent in design and execution. What fascinates me most is their concept of a "Community Post." When a post is edited by more than four users, it is promoted to a Community Post, which is editable by every user and no longer belongs to the original owner. Apparently, they use a wiki-like versioning system for their posts, so the original post is owned by the original posting user, subsequent versions I suppose are owned by their editors (the user who revised it), and after four unique edits becomes the property of the community.
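The transition from owned post to community post can be sketched as a revision history plus a counter of distinct editors. This is a guess at the shape of the mechanism, not Stack Overflow's implementation. The `Post` class and its methods are hypothetical, and `COMMUNITY_THRESHOLD` is an assumed value, since the description above leaves the exact count ambiguous.

```python
from dataclasses import dataclass, field

# Assumption: promotion happens once this many distinct non-owner users
# have edited the post. The exact rule on the real site may differ.
COMMUNITY_THRESHOLD = 4


@dataclass
class Post:
    owner: str
    body: str
    history: list = field(default_factory=list)  # (editor, body) revisions
    community: bool = False

    def __post_init__(self):
        # The first revision belongs to the original poster.
        self.history.append((self.owner, self.body))

    def unique_editors(self):
        """Distinct users other than the owner who have revised the post."""
        return {editor for editor, _ in self.history if editor != self.owner}

    def edit(self, editor, new_body):
        """Record a revision and promote the post once enough users have edited."""
        self.body = new_body
        self.history.append((editor, new_body))
        if len(self.unique_editors()) >= COMMUNITY_THRESHOLD:
            self.community = True
```

Because every revision keeps its editor, attribution survives in the history even after the post itself becomes communal.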

This mechanism provides a smooth transition from traditional _authorship_ to the communal writing style of the wiki where the community is the author and authorship is anonymous. I wish I had thought of it, since the original idea for my site was a "code wiki" that would not just provide solutions to programming questions but help coders learn from the results and improve their skills. I don't want to rehash my failures with phphelp.com, but to highlight an innovative way of providing a smooth transition between individually owned and communal content.

One of the questions raised by this is authorship. People like attribution because it builds their reputation. So in a wiki environment, they lose their attribution. A user's post becomes a community post. So what happens to a user's credit? One solution is to create an indirect proxy for credit in a communal authorship environment, so that good authors get "badges" or "reputations" that they wear independently. Instead of a "byline" for your post, you get a badge representing the amount and effectiveness of your contributions.

Which is better? Everyone owning their own content or communal content? It really depends on the audience and goals of the site. Some people prefer to own their own content and share it. This is how most social media sharing sites work. You own your content and your friends own their content and the site provides a way of sharing it. Social bookmarking sites also enable users to keep their own content separate from others and then the content is mixed and matched through tag navigation. A wiki-style system generally views content as communal. Stackoverflow solved this problem with a novel mechanism for transitioning content from individual to communal status.

It occurred to me this mechanism might be valuable in a so-called bliki system, which is a blog and a wiki combined. In a bliki, users create quick, timely posts like blog entries connected to dates, but they can also edit the content of posts to create and reference wiki pages. This enables users to make quick sketchy entries like a blog, but then later, reflect on those entries with longer posts. This is called "quick-slow" in bliki terms. What if this process could be facilitated by automatically transitioning the "quick" blog post into a "slow" wiki page? Instead of making a blog post then creating a wiki page linked to it with extra information, the blog post would at some point transform itself into communal content, from blog post to wiki page. Authorship would still be retained because each post would still exist in the wiki history. Anyone could go back to the original blog post to see who posted it and what it was about.

Social Realms: Sharing and Publishing Become One

There is an increasing recognition of the importance of 'social realms' within the context of social networking. Some social sites started out as "walled gardens" where only friends could see social content a user posts. Other sites started out with all content posted being public like a graffiti wall. Social site builders now recognize there should be many fine gradations of control over viewing and sharing social content. These social realms extend out from the user in concentric circles, from being able to see their own content ("me"), to friends, to friends of friends, to networks or groups of friends, and finally to the public.

Blogging was always seen as a form of publishing. The new systems emerging now are centered around "social blogging" or "social news feeds" and are called by various names. Facebook merged their "wall" application and their "mini-feed" application in a single feature called The Wall, an example of one of these new forms for facilitating social interaction between small groups of friends in an asynchronous manner (as opposed to chat or telephony). Like Twitter and Jaiku, they enable "social peripheral vision" or seeing what your friends are doing and passing brief notes back and forth to keep in touch or coordinate activities. These posts are not publishing in the traditional sense and are not considered publishing, since in theory, the posts are intended for friends (although some sites offering these services create a kind of public feed everyone can see).

The Wall on Facebook has all the elements of Jaiku or other similar sites, a series of blog-like posts limited to a brief snippet of text in reverse chronological order with the ability for users to comment on them. What makes them social is that the posts are seen by your _friends_ who are the only ones who can comment. So you could post about going to the farmer's market on Sunday and a friend could comment by asking you to pick up some tomatoes. Another friend could comment they will be at the same market and will meet you there. Comments are an important feature because they enable individualized topical conversations. If friends could only post to the "circle of friends" feed, the conversation would become disjointed. Social posts are the start of conversation.

This just emphasizes the need for social realms that determine the scope in which social content is accessible. Facebook offers several social realms for Wall posts, your own, your friends, your friends of friends, your network of friends, the public.
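Treating the realms as concentric circles suggests a simple visibility check: work out the innermost realm a viewer occupies relative to the author, then compare it with the realm the post was shared to. The sketch below is an illustration only, not Facebook's model. The names `Realm`, `viewer_realm` and `can_view`, and the plain dictionaries standing in for friend lists and network memberships, are assumptions.

```python
from enum import IntEnum


class Realm(IntEnum):
    """Social realms ordered from innermost to outermost."""
    ME = 0
    FRIENDS = 1
    FRIENDS_OF_FRIENDS = 2
    NETWORK = 3
    PUBLIC = 4


def viewer_realm(author, viewer, friends, networks):
    """Return the innermost realm the viewer occupies relative to the author.

    friends maps a user to the set of their friends;
    networks maps a user to the set of networks they belong to.
    """
    if viewer == author:
        return Realm.ME
    author_friends = friends.get(author, set())
    if viewer in author_friends:
        return Realm.FRIENDS
    if any(viewer in friends.get(f, set()) for f in author_friends):
        return Realm.FRIENDS_OF_FRIENDS
    if networks.get(author, set()) & networks.get(viewer, set()):
        return Realm.NETWORK
    return Realm.PUBLIC


def can_view(post_realm, author, viewer, friends, networks):
    """A post is visible to everyone at or inside the realm it was shared to."""
    return viewer_realm(author, viewer, friends, networks) <= post_realm


friends = {"ann": {"bob"}, "bob": {"ann", "cat"}, "cat": {"bob"}}
networks = {"ann": {"davis"}, "dan": {"davis"}}
```

Under this model publishing is just the outermost case: a post shared to `Realm.PUBLIC` passes the check for any viewer.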

The last is interesting, because it brings us full circle. Most platforms were publishing platforms before the social networking craze, then there emerged platforms for social sharing but without any publishing. Now the two platforms are converging into a single platform for sharing with granular control over the social realms into which any piece of content goes, from sharing with a circle of friends to publishing to the whole world and every gradation in between.

Publishing has a completely different feel to it than social sharing. It requires different tools, ones which facilitate authorship, but have no need for defining the social realms in which the works of authorship will be consumed. I had watched the emergence of Twitter and Jaiku but failed to see their significance, since their posts were so brief. I saw them as being limit blogs, an idea I had toyed with in the late 90s, but bloggers were more interested in longer and longer posts, being literary types. They were interested in publishing. It was only on finally understanding the social use of these short-message systems (it is no accident the popularity of SMS corresponds with the popularity of these small message blog-like systems) to keep people in touch socially that I understood their usefulness. It makes little sense to criticize the inane or brief posts to Twitter as not contributing to human knowledge or letters; the purpose of these sites is, as it is said of Jaiku, to maintain social peripheral vision (something I didn't even know I needed and still feels uncomfortable in the "buddylist 24/7" way it is presented). Maybe someone should start a site called "Tome" for long posts of intellectual brilliance contributing to the total of human knowledge, a mirror image of Twitter. Or perhaps that was what Blogger was supposed to be.

The convergence between sharing and publishing, which began with the original c2 wiki and the lowering of barriers to a read/write web, is emerging as a powerful new metaphor for interaction. Publishing will come to be seen as just sharing with everyone. All content, all media will be social and social realms determine the intended audience.

At farmfoody.org, we will be moving quickly to provide our users with this kind of close-knit interaction, which eschews the private message metaphor derived from email and the blog metaphor from publishing. A graffiti wall is too public and random to be of much use, private messages are stultifying and open to abuse since anyone can send a private message across social realms. The blog was intended for publishing, the feed for syndication, but this new format, the social feed or blog, converges sharing and publishing into a form easily digestible and controllable by users.

Namespaces for Tags

I've been thinking about "namespaces" for tags lately. Sometimes tags become too random, disorganized, or numerous to be relevant or useful. One way of cutting through the clutter is to create more than one set of tags. I've seen this on at least one website, sprig.org, which offers "togs" or an alternative set of tags to classify posts by. The difference is these tags are restricted to a particular concept, types of ecology-related terms, such as "organic." What this secondary set of tags produces is in reality a set of tags under another namespace "Ecology."

It is possible to organize tags into namespaces, each representing a concept. This would not be imposing hierarchy on tags, but creating nodes representing concepts. So that Ecology might contain organic, carbon free, sustainable, etc. and Mathematics might contain number, equation, factor, etc.
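As a small illustration of tags grouped under concept nodes, the sketch below assumes a hypothetical `Namespace:tag` notation, with bare tags falling into a default group. Neither the notation nor the function name `group_by_namespace` comes from any existing tagging system mentioned here.

```python
from collections import defaultdict


def group_by_namespace(tags, default="general"):
    """Group 'Namespace:tag' strings by namespace; bare tags go to `default`."""
    grouped = defaultdict(list)
    for tag in tags:
        namespace, separator, name = tag.partition(":")
        if separator:
            grouped[namespace.strip()].append(name.strip())
        else:
            grouped[default].append(tag.strip())
    return dict(grouped)


tags = ["Ecology:organic", "Ecology:sustainable", "Mathematics:equation", "tomatoes"]
by_concept = group_by_namespace(tags)
```

The result is flat: each namespace is a single node holding its tags, with no nesting, so it stays closer to a set of concept labels than to a hierarchy.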

I organize my photographs in Photoshop Elements using tags. I chose to avoid using tags like categories and instead only create tags for qualities of the image. I try to create tags that describe the image the way an art historian might classify works by their elements or an archivist might classify images according to social use. An image depicting people at work is an "occupational" for example. A painting might be "abstract" and "nature" and "patterns."

Here is a partial list of my tags. I try to create tags for

a) Qualities of art, such as Landscape or Pictorialist
b) Things that can be seen in photographs, concrete like Aircraft or abstract like Patterns
c) Subjects, categories of subjects, concrete like Nature, Sky or abstract like Time


Abstract
Aircraft
Automobile
Birds
Butterflies
Concrete
Flowers
Impressionist
Landscape
Leaves
Nature
Patterns
Photos
Pictorialist
Plants
Rain
Shadows
Sky
Snow
Time
Trees
Urban

I can see some benefit in putting these in a namespace, limiting the tags in this space to reduce clutter. For example, tags on Buddhism would not be found in great number in this set (unless a) you have a lot of Buddhist photography or b) you attach tags from a Buddhist namespace and then they wouldn't be in the set). I don't know how successful namespaces might be for tagging. Programmers love namespaces, but ordinary people find them confusing. I like the idea of tying namespaces to concepts.

I think namespaces would come in handy when choosing tags from a list, like when you show all labels in Blogger's interface. You get one long unreadable list of every tag you've used. Sometimes I love tags when I can just enter the key words that are in my mind while writing a post, but sometimes I hate them when what I really want are categories. I read an article the other day by a graphic artist who designs for the web who continued to use the web safe palette long after it was not technically necessary. He argued that artists tend to choose colors from a comprehensible and memorable palette of colors, such as the Pantone set or the set of colors defined by the various oil pigments. With 16 million colors there are far more colors than anyone could recall or discern. For every "olive green" there are hundreds of colors in between that and the next discernible color moving in either direction on the color wheel. It helps to have a standard color when envisioning or communicating "olive green" to others. I think tags are afflicted with this problem.

Simplicity and Community

Community: From Little Things, Big Things Grow is a really good overview of how community grew on Flickr and some of the philosophy informing how social community works.

At Flickr, we’ve worked very hard to remain neutral while our members jostle and collide and talk and whisper to each other. Sharing photos is practically a side-effect. Our members have thrilled and challenged us—not just with their beautiful photography, but by showing us how to use our infrastructure in ways we could have never imagined.

This is the same principle that operated when the web was born. It was simple, open and flexible enough that people could put it to unintended uses. It wasn't overdesigned. The net itself enabled people who "shouldn't" or "wouldn't" want to connect to find each other. It enabled people to find information they "shouldn't" need or want to find. It enabled people to find, and share, what was important to them.

As I just wrote, the content, the pictures, the things we share on a site like Facebook have little to do with the success of a social utility. Success has everything to do with keeping up with your friends. That involves photos, but it is people keeping up with what friends are doing, whether gardening or photographing, and engaging in activities, like who can create the best compost heap or who has the best fashion photograph, that sustains a community.

The sculpture demonstrated a fascinating idea: given fewer rules, people actually behaved in more creative, co-operative, and collaborative (or competitive, as the case may be) ways.

It should not be surprising, given that HTML was a simplification of rule-heavy SGML. Given fewer rules, anyone could make web pages and share them. Every time the network, the web or information technology has grown, it has been through a simplifying moment. It is also why the wiki has touched such a nerve online and been so inspiring to what came to be called "Web 2.0" applications. It reminds me of a cool new online note-taking tool, Luminotes. I find its overall simplicity refreshing (for example, its simplified set of text markup options set off in oversize buttons and the brilliant recasting of the one-page-at-a-time wiki into a scrollable set of note cards). Is the ideal website a tabula rasa like a wiki, a blank page available to users without any structure? I doubt it. Since that would just be a whiteboard or "graffiti wall," there have to be some simple rules aimed at organizing the activity toward some basic interests, as Flickr does.

It is true, corporations think they can "add community" like adding new delivery routes or buying an aircraft to open up a new route. You don't add community, you grow it. At farmfoody.org, we have to keep lines of communication open to independent farmers, many of whom have a low opinion of the usefulness of anything online. It takes a lot of time, commitment and personal touch to grow this kind of community. You have to show why getting online is important, and be ready to answer the inevitable questions.


SpeakUp: A Transcript Markup Language

What is SpeakUp?

A simple text markup language for transcripts of moving pictures or video including a markup language for annotation.

Overview

When the Folkstreams project required a way for filmmakers and academic contributors to create and maintain transcripts for films archived and presented through the Folkstreams website, I decided a simple text markup language would be the best way to store and edit transcripts.

A transcript markup language defines a series of conventions for formatting text (like wiki text) that is translated into HTML for display. SpeakUp was designed to contain as much content as possible and preserve meaning for possible later conversion into XML or database form.
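As a rough illustration of what "conventions translated into HTML" means, here is a minimal sketch in Python. The two conventions it handles (a `SPEAKER:` prefix for dialog and `[[...]]` for an inline note) are invented for this example and are not SpeakUp's actual syntax. The function names and CSS class names are likewise hypothetical.

```python
import html
import re

# Hypothetical conventions, invented for this sketch (not SpeakUp's syntax):
#   "SPEAKER: words"  -> a line of dialog
#   "[[note text]]"   -> an inline annotation
SPEAKER_LINE = re.compile(r"^([A-Z][A-Z ]*):\s*(.+)$")
NOTE = re.compile(r"\[\[(.+?)\]\]")

def render_line(line):
    """Translate one transcript line into an HTML paragraph."""
    text = html.escape(line.strip())  # escape first so user text cannot inject tags
    text = NOTE.sub(r'<span class="note">\1</span>', text)
    match = SPEAKER_LINE.match(text)
    if match:
        speaker, speech = match.groups()
        return f'<p><b class="speaker">{speaker}</b> {speech}</p>'
    return f"<p>{text}</p>"

def render(transcript):
    """Translate a whole transcript, skipping blank lines."""
    return "\n".join(
        render_line(line) for line in transcript.splitlines() if line.strip()
    )

sample = "NARRATOR: We start early [[an annotation goes here]]\nMusic plays."
print(render(sample))
```

Because the speaker and the note stay distinct elements in the output, the same parse could later feed an XML or database export instead of HTML.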

SpeakUp is implemented as a module extending the PEAR Text_Wiki text translation library, which is required to use it.

Although development and documentation of SpeakUp is not complete, it is in use on the Folkstreams website.

SpeakUp, including all markup, code and documentation, is open source and released under a GPL license. I apologize for the brevity of this document, but the best way to learn SpeakUp is to download the package and experiment with it. Download.

Some Background

First, some background on why transcripts are important. As the Folkstreams project was developed, project director Tom Davenport and developer Steve Knoblock, in a series of discussions, arrived at the conclusion that transcripts are essential to searching, finding and understanding films online. Two points emerged: that transcripts are a rich source of indexable text that helps make media searchable and that, more importantly, transcripts are a rich source of conversation and debate.

Frequently notes are more informative and interesting than the work they annotate. We discovered this was true for film transcripts (see Sadobabies for an example of a conversation going on in the notes about the nature of folklore). Although there are sophisticated means to capture the dialog of a moving picture and render it to text, these transcripts are inadequate. They lack annotation. They lack the expressive quality of a transcript edited by a knowledgeable person. They are, in a sense, a travesty, like an OCR'ed copy of Dickens left uncorrected.


Open Flash Charts

I recently discovered a wonderful new open source project for creating Flash charts. It is open source, non-proprietary and, best of all for a non-profit on a tight budget, it is free. In the last week I deployed Open Flash Charts after integrating the package into our Folkstreams content management system. For users of our system (through their personalized area My Folkstreams), this will be a great improvement in the quality of charts. We make the statistics on visitors and video views available to filmmakers, and the Flash charts are simply beautiful compared to our old ones based on phplot.

You can download the code for OFC (Open Flash Charts) from their homepage. It is the work of John Glazebrook and he must be a designer, because the default charts in the tutorial are beautiful and take advantage of the interactive features of Flash. I discovered a few kinks that need working out, but overall, this is an excellent addition to the open source code making up the Folkstreams platform.


The Swicki: Collaborative Search

I've long thought that the fatal flaw of what library science calls "finding aids" is that they only organize information according to how it relates to other information. What I've always wanted is a search that relates information to what I care about, to my interests, to me. I've thought about "personalized searches" but the trouble with this approach is that it is time consuming to express to a computer exactly what it is you want. You must set up some kind of criteria and then the search returns results for you based upon it, such as the simple eBay search notification. If lots of people are going to use this with the efficiency they now get from Google, something else is needed. We don't have expert systems and artificial intelligence yet, so what is a possible solution?

Some are experimenting with attention. By tracking what you look at online, a profile of your interests can be built, which can then drive a personal search engine. People are really bad at expressing what they really want (as product developers and marketers have discovered) so the non-intrusive method of observing behavior may just work.

Another experiment takes a different approach. Why not let other people help refine your search? That is what the people behind Swicki seem to have done. If you could gather like-minded individuals into one location where they could influence the accuracy of the search, the "search in a can" model could be improved. It goes farther than that. By becoming an aggregator of search results, the system can ride on top of the web and use it as a database in a way similar to Yahoo Pipes. The most revolutionary aspect of Swicki is user-created search engines. Instead of needing millions of dollars and massive servers, Swicki piggybacks on existing search results to enable anyone to create a web search engine. This kind of democratizing is a defining quality of web two point oh applications.

I see how the canned search model can be turned inside out, by allowing a group of users to collaboratively refine the canned search to improve it. Instead of empowering the computer to be smarter, it empowers people to create a smarter resource. It definitely becomes a kind of search-wiki. It competes in some ways with the idea of folksonomy. We now have user-created taxonomies and user-created searches. What I like about both developments is how they democratize the organization and finding of information. The folksonomy enables people to create their own vocabularies, perhaps multiple vocabularies for the same subject area. The wiki search enables people to create alternative search results for the same subject. My background is in a subject where unresolvable disagreement is commonplace. It's called genealogy, where there are no facts, only interpretations, and sometimes two families claim the same individual. This is not a cause for concern in genealogy, and I like the way more than one truth can exist within the same framework. It's much better than declaring one view right and all others wrong and working hard to keep your opponent's views out of view. Despite what some may claim, there can be more than one version of the truth. Let an idea gain mindshare on its own merits.
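A minimal sketch in Python of what "collaboratively refining a canned search" could look like. This is my own illustration under simple assumptions (plain up and down votes, results as URLs), not a description of how Swicki actually works. The class name `SharedSearch` and the example URLs are hypothetical.

```python
from collections import defaultdict

class SharedSearch:
    """A canned search whose results a group of users can promote or demote."""

    def __init__(self, results):
        self.results = list(results)   # order supplied by the underlying engine
        self.votes = defaultdict(int)  # url -> net community votes

    def vote(self, url, delta):
        """Record a +1 or -1 vote from a community member."""
        if url not in self.results:
            raise ValueError(f"unknown result: {url}")
        self.votes[url] += delta

    def ranked(self):
        """Return results ordered by votes, highest first."""
        # sorted() is stable, so results with equal votes keep the engine's order.
        return sorted(self.results, key=lambda url: -self.votes[url])

search = SharedSearch(["a.example", "b.example", "c.example"])
search.vote("c.example", +1)
search.vote("a.example", -1)
print(search.ranked())
```

The underlying engine still does the heavy lifting. The community only reorders what it returns, which is why no massive servers are needed.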

I've thought before about a search engine where you could search the web by creating predefined searches, but never thought of letting everyone edit your predefined searches. That is novel, just as social bookmarking was.

You can see the Swicki I created in a few minutes in the sidebar of this site (as long as it's there).


A Good Introduction to Yahoo Pipes

A good introduction to what Yahoo Pipes are and the possibilities they offer was posted to Read/WriteWeb. It also includes a simple, clear, concise explanation of emergence:

It is this automagical process through which elements of a system give rise to a higher order system. Emergence is how physics becomes chemistry and chemistry becomes biology.


Quick-slow: A way to give meaning to media?

I develop the platform for the folkstreams.net website, which is a non-profit archive for rare folklore documentary films. We transfer the films to video and then stream them to the web so they are not lost, molding in some archive never to be seen. Many of our films have not been seen for twenty years or more; one was rescued from a barn. As such, we are strong advocates of open access to archives (and I am happy to learn many other institutions in the folklore world also understand how important access is to a sustainable archive and are using the web in wonderful ways).

To the point. It has always been important for our films to be presented in context. I have always believed that media without context is meaningless, whether that is a family photograph or a documentary film. A photographic image is merely an interesting composition without the information necessary to understand it, to interpret it. All images must be read...oddly enough, since they are the seeming opposite of text, which everyone acknowledges must be read. We teach literacy, but we don't teach the equivalent for images. (What would that be: imagacy, photoacy, videocy? That last one sounds too much like idiocy to be comfortable.) A photographic image may affect us as a work of art, or it may present an attractive composition, but beyond that it requires context. The same is true for moving pictures. This is why each film on our website is nestled in a set of contextual materials. I sometimes doubt that many people read them, or read the transcript, but they are there to give meaning to the films, to place them in context so that people may better understand the subjects and ideas presented in the film.

I've wanted for some time to build a small content management system where:

* Media is on an equal footing with text, but also where media is the centerpiece.
* Media is as easy to work with and place in context as working with text in a wiki.

There is a form, a mashup you might call it, between a blog and a wiki, called a Bliki. I had not paid too much attention to this development until I read on one of its advocates' websites that the idea of a Bliki involved a "quick-slow" process. The blog enables a user to quickly write a blog entry, something quick, potentially ephemeral and tied to time; at that moment or any time later, the user may also create a wiki page connected to the blog entry, for slower moving, more thoughtful and persistent content.

I think this would apply perfectly to such a media application, which would be useful in personal publishing and could help small archives (local history and genealogy societies, libraries and archives) manage and create access to their image collections. As volunteers scan images, they could upload them with descriptions as blog entries, then they or others could provide context through wiki pages associated with the image metadata in the blog entries. The workflow would be: upload, give a title and description, come back later, drop a wikiname in the description and then create a page. It could encourage local people to contribute memories to photographs, for example.
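The quick-slow workflow can be sketched in a few lines of Python. This is only a thought experiment, not an existing application. The class names `Entry` and `Archive`, the file name and the assumption that a wikiname is a CamelCase word are all mine.

```python
import re
from dataclasses import dataclass, field

# Assumption for this sketch: a wikiname is a CamelCase word such as "MainStreet".
WIKINAME = re.compile(r"\b(?:[A-Z][a-z]+){2,}\b")

@dataclass
class Entry:
    """The quick part: an uploaded image with a title and description."""
    title: str
    image: str
    description: str = ""

@dataclass
class Archive:
    entries: list = field(default_factory=list)
    pages: dict = field(default_factory=dict)  # wikiname -> page text

    def upload(self, title, image, description=""):
        """Quick: record the image as a blog-style entry."""
        entry = Entry(title, image, description)
        self.entries.append(entry)
        return entry

    def missing_pages(self, entry):
        """Wikinames mentioned in a description that have no page yet."""
        names = WIKINAME.findall(entry.description)
        return [name for name in names if name not in self.pages]

    def write_page(self, name, text):
        """Slow: add lasting context under a wikiname."""
        self.pages[name] = text

archive = Archive()
photo = archive.upload("Parade", "parade.jpg", "Parade on MainStreet")
print(archive.missing_pages(photo))  # pages someone could come back and write
archive.write_page("MainStreet", "Memories of the street go here.")
```

The entry is useful the moment it is uploaded. The list of missing pages is an open invitation for anyone to add context later.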

Today, I actually came across the first example in the wild of someone doing this, someone using the Bliki "quick-slow" philosophy to give meaning to images. "I added Wikipedia links to my Flickr photos..." (http://instones.org/archives/61 2007) This is exactly the kind of behavior I would like a web application to enable and encourage: one that would facilitate quick image upload and metadata, but would also enable and encourage placing media in context, as well as supporting tagging for folksonomy. It's more a philosophy than a technology.


Yahoo Pipes

I made my first pipe today at Yahoo Pipes. It takes our Folkstreams recent additions feed, finds Flickr images that correspond to the folkloric subject categories associated with each entry, and rolls them into the feed. I'm not sure if I constructed it properly or if it's useful, but it works: the Flickr images do show up in an element of the RSS item for each film.
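In code, the shape of that pipe is roughly the following. This is a Python sketch of the idea, not how Yahoo Pipes is implemented and not a real Flickr API call. The feed is assumed to be already parsed into dictionaries, and `find_images`, `SAMPLE_PHOTOS` and the sample items are stand-ins invented for illustration.

```python
def feed_items(feed):
    """Source stage: yield each item of an already-parsed feed."""
    yield from feed

def attach_images(items, find_images):
    """Enrichment stage: add images matching each item's categories."""
    for item in items:
        images = [url for cat in item["categories"] for url in find_images(cat)]
        # Build a new item rather than changing the one from the source feed.
        yield {**item, "images": images}

# Stand-in for a photo search service; a real pipe would query Flickr here.
SAMPLE_PHOTOS = {"quilting": ["quilt1.jpg"], "blues": ["blues1.jpg", "blues2.jpg"]}

def find_images(category):
    return SAMPLE_PHOTOS.get(category, [])

feed = [
    {"title": "Film A", "categories": ["blues"]},
    {"title": "Film B", "categories": ["quilting", "pottery"]},
]
enriched = list(attach_images(feed_items(feed), find_images))
print(enriched[0]["images"])
```

Each stage takes items in and passes items out, so stages can be rearranged or swapped, which is what makes the approach feel like Unix pipes.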

It's a fascinating concept. Although a feed mashup is not a new idea, Pipes is very broad, powerful and slick. It borrows from several novel websites, with shades of Ning, where users build technology using Lego-brick-like software components. Of course, it's like Unix, a set of simple tools with standard input and output that can be endlessly combined into new, useful tools. It also draws upon open source software development, since others can see how you constructed a pipe, copy it and make it their own to study or build their own tools on.

Pipes is a social network version of an RSS feed mashup service. The idea of mashing up RSS feeds to create a new feed, combined from other filtered feeds, is nothing new. But sharing the feeds and the construction of feeds with other users in a social network environment, as with social bookmarking, is novel. You can do the normal social networking things, such as aggregating the most popular pipes. Any kind of content item can be social networked, and aggregates can then be derived from that usage data.

It definitely has shades of open source, since you can see how any pipe is made, and shades of wiki, in that anyone can study, edit and build the application from within the application through a language. It is similar to Emacs or even Blender (which exposes its interface and engine to Python, so new interface elements and functionality can be added by users). Users becoming software builders is a growing phenomenon, just as the wiki showed readers could become authors, all part of a general trend toward what is called professional-amateurism.
