Friday, January 15, 2010

Welcome community plumbers (aka Drupal)

Towards the end of last year we loaded up the Drupal archives and subscribed to their active lists. Drupal is a pretty successful project with lots of real-world use and we figured they might be reasonably chatty online.

As preparation for this post, I checked out the Drupal front page. I love a good analogy and their HTML <title> put a good smile on my face:
    <title>drupal.org | Community plumbing</title>
It made me think we might re-consider the MarkMail title, given what we do. MarkMail helps organize development communities and so my first thought was:
    <title>markmail.org | Community organization</title>
More specifically, we help with project histories so I changed it to:
    <title>markmail.org | Community histories</title>
And then, perhaps best of all, I landed on:
    <title>markmail.org | Community libraries</title> 
And as the chart above shows, these folks are pretty chatty and have been so since about 2004.

So here's a hearty welcome from us at MarkMail

Librarian

to the busy folks over at Drupal

Plumber

Wednesday, January 13, 2010

The wide world of Ubuntu: 1.9 million messages


A few months ago, we loaded up the publicly available Ubuntu archives and subscribed to their active lists. MarkMail now searches 313 Ubuntu lists and 1,917,153 messages. The first Ubuntu list started in July 2004 and there are currently 200 active lists, recently accumulating 4,061 messages per day.
Ubuntu is very much a world-wide project. Below is a recent snapshot of the message counts for Ubuntu lists associated with specific countries:

    Country           Messages
    Argentina           26,923
    Austria              5,597
    Brazil              67,940
    Colombia            20,758
    Germany             19,438
    Spain               41,489
    Italy               43,618
    Nicaragua           11,761
    Russia              19,321
    United Kingdom      21,995

Sunday, November 8, 2009

Magic Eraser Policy

If you've been around email long enough, you know there's a reason that Gmail has that magic undo send feature. (Well, ok, once you see what it is, it's not that magic.)

Your permanent record

Because of the nature of digital information, it's too easy to have that email end up somewhere you'd rather not see it. Once you hit send, you don't have much control over the copying process. That email immediately becomes part of your permanent record. And, well, you've probably been there yourself at one time or another, wishing you hadn't sent it.

Even though there are a ton of sources on email etiquette, it's not too surprising that folks send email they wish they hadn't. We get weekly evidence of that. Like search engines and other popular archives, we get our share of requests to remove content. Sometimes folks ask for individual emails to be removed. Other times, we get more open-ended requests like, "please remove my name from your site".

The funny thing is that mailing list tools aren't really set up for redaction. They don't make it easy to remove individual emails, let alone parts of emails, from their archives. So, what does MarkMail do with this issue?

Magic Eraser

MarkLogic Server to the rescue!

One of the cool things for us is that MarkMail actually has a magic eraser. With little pain, thanks to the real-time index provided by MarkLogic Server, we can remove emails from our index at a moment's notice. With a single line of XQuery code, we can move the document representing that email into a hidden collection and in nanoseconds, it's gone from all further queries to MarkMail. Yes, that is neato.
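For the curious, here's a rough sketch of the hidden-collection idea. It's a toy in-memory model in Python, not our actual XQuery and not the MarkLogic API; the `Archive` class, the `erase` method, and the collection name `hidden` are all made up for illustration. The point is only that an erased message stays stored but is tagged so that every query skips it.

```python
from dataclasses import dataclass, field

HIDDEN = "hidden"  # hypothetical collection name, chosen for this sketch


@dataclass
class Message:
    uri: str
    subject: str
    collections: set = field(default_factory=set)


class Archive:
    """Toy in-memory stand-in for a searchable message store."""

    def __init__(self):
        self._messages = {}

    def add(self, message):
        self._messages[message.uri] = message

    def erase(self, uri):
        # The "magic eraser": move the message into the hidden
        # collection rather than deleting it.
        self._messages[uri].collections = {HIDDEN}

    def search(self, term):
        # Every query excludes anything in the hidden collection.
        return [
            m.uri
            for m in self._messages.values()
            if HIDDEN not in m.collections
            and term.lower() in m.subject.lower()
        ]
```

In this sketch, calling `erase("/m/2")` leaves the message in storage, but the next `search` no longer returns it, which is the behavior described above.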

So we have the technology.

Still, we don't take message removal lightly. Because MarkMail serves as an authoritative record of public history, we treat removals as the exceptional case. MarkMail provides content that it receives from publicly available sources. Everything we serve, we received from another source. List administrators control their own lists and set policies on archives and we respect that. By posting on their list, you follow their rules, and we do too. So, if you want something removed from MarkMail, you'll usually have to get it removed from the original source first.

We've recently added our removal policy to the site, which has full details on how we deal with removal requests. The mechanism starts at our feedback page. In a nutshell, we will remove emails under two specific cases:
  1. Content in Violation of the MarkMail Content Policy (e.g. spam, porn, virus, fraud, illegal activities, copyright violations)
  2. Content Removed from Official Archive
When content clearly violates our content policy, we may simply remove it. No surprise there. That's bad stuff. Any other material needs to be removed from the original source before we'll act. In all cases, we prefer requests from the archive owner and/or requests that come with evidence that the content has been removed from the originating source.

This policy attempts to balance the rights of many parties, including those who have posted content to public lists and forums, those who own and administer the public lists and forums, and those who link to and reference the MarkMail archives as an information source and accurate record of history. We hope you find it reasonable.

Wednesday, November 4, 2009

Easy Change

As we look to make MarkMail pay for itself, it's pretty darn obvious that the traffic the site gets is sufficient to generate some revenue from advertising. A good number of well-known developer-centric sites display ads (e.g., sourceforge.net, www.xml.com, vim.org, linux.org, many others) with varying degrees of success. It even looks like msdn.com displays ads (albeit for other MS properties). And there are also sites like Experts Exchange that charge users a premium to search their archives, while also displaying ads.

The plan is to make minor adjustments to the site layout and provide relevant ads that are meaningful to a software developer. Over time, we'll be looking to bring in ad content that is

  • targeted to the list archive the developer is reading
  • suitable for a library (quiet, text-only, unobtrusive)
  • designed for and targeted to a developer audience
We'll be enabling Google AdSense text-based ads soon, as they appear to be best-of-breed and simple to implement. But we'll also be looking to sell advertising space ourselves.

In the meantime, before we're fully set up, if you're reading this and you have something worth communicating to some of the millions of folks that end up at MarkMail, give us a holler.

PS. If you sign up for a MarkMail account, you'll have access to a switch that will enable you to opt out of the AdSense ads.

Thursday, October 8, 2009

The New Guy

Hi Folks,

I'm the new guy and this is my inaugural post.

I first ran into MarkMail a few years ago, during my tenure at Clearwell Systems. Back then, both Clearwell and MarkMail were building "search engines for email". If you didn't look deeply, you'd have thought we were competitors. But we weren't. And we aren't, still.

At Clearwell, we knew we were breaking ground. Like many start-ups, we really didn't preconceive which applications our efforts would enable and which markets would find our solutions valuable. We just believed, with all our hearts and minds, that there was something there and we could get it done. I've since learned the history isn't too different at MarkMail. To make a long story short, Clearwell went one way, focusing on enterprise email environments, like Microsoft Exchange, PST Files, and Lotus Notes, eventually cracking the nut on common Electronic Discovery use cases. We built a really sweet e-discovery product that continues to get rave reviews and save customers $$$.

And MarkMail went another, supporting public, open-source communities and mailing lists (Mailman, Ezmlm, Google Groups, and others). Through these efforts, MarkMail has become a large-scale, highly respected, and high-traffic service for software developers. Ironically enough, as a software developer, it was not uncommon for me at Clearwell to ultimately use MarkMail. So I've been a fan for quite some time.

And now, thanks to the hard work of good folks at Mark Logic (and Jason Hunter in particular), I'm here to help further the MarkMail mission. I bring to MarkMail deep experience in software development practice, high-performance computing, and user interface engineering (OpenLaszlo, LaszloMail). For a good percentage of this time, I've been in and around all sorts of email, communication, and collaboration tools, especially those used by developers. And in my most recent gigs, I've been focused on making sure my engineering efforts are part of a broader business success. And I plan to do the same for MarkMail.

So... to take a line from Bette Midler, "Enough about me." What am I going to do for MarkMail? Well, a lot I hope. In the immediate term, I've got a few obvious directives, like keeping the site going and growing. I’ll also be focusing on the changes needed to make the site responsible (aka pay) for its own operations, while keeping a focus on the general communities that it serves. To that end, you can expect another post (or two) about upcoming changes.

Sounds like fun, eh? Well, you'll get to hear it all, assuming I continue to be able to make time for blog posts like this one. Thanks for listening and please holler at me with advice and comments. And, of course, if you have feedback or ideas related to how MarkMail might help you, please holler.

Oh... and one more thing. If you're in the Bay Area and you want to see me in person, you can also find me occasionally playing out with the Tribal Blues Band (I'm the one in the blue shirt).

Tuesday, June 23, 2009

MarkMail at the first MarkLogic User Group

Last week I spoke at the inaugural Mark Logic User Group meeting in Reston, VA (near where a lot of our government customers are based). The topic was MarkMail: where the idea came from, how we built it on the cheap, how Mark Logic began using it internally, and some lessons we learned as we scaled out the public high-traffic site. It's a similar talk to the one I gave at the Mark Logic User Conference in San Francisco last month.

For those interested, the slides are available as a downloadable MOV file. Click to advance.


The slides are fairly simple. Most of the fun of the talk (well, at least for me) is in the stories I tell, usually relating to the quotes in italics at the bottom of slides. I suppose you'll just have to use your imagination.

Tuesday, May 5, 2009

MarkMail at the Mark Logic User Conference

The Mark Logic User Conference is coming up next week. If you're coming to the show, I encourage you to attend the talk on MarkMail I'll be giving on Wednesday. I'll tell the story of MarkMail as it progressed from my first idea to a night project built with Ryan Grimm to the robust web site you see now at markmail.org (and even to the other web sites you don't see, because they're running behind people's firewalls). It's in the conference's technical track so there'll be a lot of focus on the core tech.

If you're not coming to the show, why the heck not? :) It's not too late to register.