Gross Archive

The Ubiquitous Project Gutenberg - Interview With Michael Hart, Its Founder


November 15, 2005
Michael Hart conceived of electronic books (e-books) back in 1971. Most pundits agree that in the history of knowledge and scholarship, e-books are as important as the Gutenberg press, invented five centuries ago. Many would say that they constitute a far larger quantum leap. As opposed to their print equivalents, e-books are public goods: they cost close to nothing to produce, replicate, and disseminate. Anyone with access to minimal technology or even the oldest computers can read e-books.
Hart established Project Gutenberg - a repository of tens of thousands of public domain texts, freely available online. It is the largest and most comprehensive of its kind and has spawned numerous imitators, emulators, and mirror sites. E-books became a mainstream item with giant commercial enterprises - from Microsoft through Yahoo and Amazon to Google - entering the fray.
"Now that e-books are becoming mainstream, the giant commercial enterprises such as Google, Yahoo, Microsoft, Amazon and Random House are attempting to co-opt the e-book world from its 'Unlimited distribution' origin to the old 'Limited Distribution' paradigm of the common business plan." - says Hart.
The Industry
1. As the man who pioneered e-books, how do you feel about Google Print Library, MSN Book Search, Wikibooks, and Yahoo's Open Content Alliance? Do you feel vindicated - or unjustly ignored?
A. Actually both, and quite thoroughly in both cases.
Each time an organization claims to have invented eBooks or eLibraries, I feel both vindicated and ignored, not that either one of these is new.
However, vindication, for me, comes from the bottom up, not the top down.
Project Gutenberg is the perfect example of a "grass roots" opposite of "The Trickle-Down Theories" that run the world today. We are truly, "Of the people, by the people, for the people."
We are truly a "Trickle-UP" project, one that has been virtually ignored simply because those who follow the first rule of reporting, "Follow The Money," can never follow Project Gutenberg, since we've never had any money whatsoever.
However, if we DID get just a penny for every one of the trillion plus eBooks we have given away, based on reaching just 1% of the population, we would have enough to buy out Donald Trump, and the press would beat on our doors to give us coverage.
Still, it is MUCH more important to show that Project Gutenberg has changed the world. . .without money. . .without being co-opted by the Big Boys, simply by continuing to do this job for 35 years.
Today you can get over 50,000 titles from the Project Gutenberg sites, with no hassle, no money, no cookies, many even with no Internet (via SneakerNet: this is when you put on your sneakers to run across the street with a CD-ROM or DVD).
Our target audience is the person on the street, not the ivory tower scholars, who all want to take over how our books should look, and not the corporations, who will only want to take over in the same way they took over music downloads only AFTER they proved to be successful.
Most business plans target the 1% of the population that is most geared towards their product, and this is why they consider a "million seller" to be a great success, while Project Gutenberg targets 100 million as a reasonable success.
Most business plans are elitist by their very nature, as they target an extremely small portion of the population. Project Gutenberg was a new business plan, targeting virtually everyone and it has proven to be the most successful plan of how to use the Internet.
Google?
Google made lots of predictions and promises: "Today is the day the world changes."
But as of now, Google hadn't really even gotten started, with only 3 downloadable eBooks we could find. However, in response to Yahoo's Open Content Alliance, Google had to finally start releasing books, over 300 days into their project.
I would have LOVED to see Google put up 10,000 eBooks per week in the 10 months since their zillion dollar media blitz last December 14th (2004). They would be now approaching their 500,000th eBook, and Project Gutenberg would be working on ways to distribute them even more widely, do more proofreading, more formats, and all the other things that would keep the ball rolling.
Yahoo?
Sad to say, the media, once burned, twice shy, seem to have pretty much ignored The Alliance. . .and Brewster Kahle, who I KNOW could do more than Google has done, has ignored my requests for any information, so I can't tell you anything more than you've already heard.
Obviously, the real test of any such effort is not in the first 10 months, but perhaps in the last 10 months.
It would be GREAT to see the "10 million eBooks drive" end with 10 months in which millions of eBooks were created and put freely online but right now we have to wait to see how they do with the first few percent.
I feel a need to quote myself here, something I said on July 4th, 1971, when I first invented eBooks and thought about the repercussions:
"You will be able to hold the Library of Congress in one hand, but I am sure they will stop us from being able to do that."
(Said at The Materials Research Lab, University of Illinois, in the Xerox Sigma V computer room)
2. How do you feel about e-book piracy? Is it partly a reaction to overly onerous copyright laws? Does PG work with intellectual property lawyers?
A. I used to mention in my emails that there were thousands of "Pirates' Coves" online, but not one of them did eBooks, and that we would know when eBooks had finally "made it" when such things came into existence, just as the sales of the first million selling book, Uncle Tom's Cabin, were largely due to pirated editions. Anyone who says the publishers' history doesn't include piracy just isn't looking. Pirated editions of Uncle Tom did the same for the publishing industry as Napster did for the music download industry.
As for the book industry, the news media is constantly filled with stories about how a gallon of gas that was $.25 in 1955 has gone up 10 times over, but the price of paperbacks that were $.25 in 1955 is now $10, not $2.50, about 40 times as much, yet this is never mentioned. I can only remark here that readers of these books have been victims of price increases four times as much as drivers. Yet, you never once have seen a news story about the high prices book stores are charging as compared to gas stations, have you?
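The arithmetic behind this comparison can be checked with a short Python sketch. The prices are the ones quoted above; the variable names are illustrative only and are not part of the interview.

```python
# Prices quoted in the interview, in US dollars.
GAS_1955 = 0.25
PAPERBACK_1955 = 0.25
PAPERBACK_NOW = 10.00
GAS_MULTIPLIER = 10  # gas "has gone up 10 times over"

# Gas price implied by a tenfold increase.
gas_now = GAS_1955 * GAS_MULTIPLIER

# How many times the paperback price has multiplied since 1955.
paperback_multiplier = PAPERBACK_NOW / PAPERBACK_1955

# Paperback increase relative to the gas increase.
relative_increase = paperback_multiplier / GAS_MULTIPLIER

print(f"Gas now: ${gas_now:.2f}")
print(f"Paperbacks cost {paperback_multiplier:.0f} times their 1955 price")
print(f"That is {relative_increase:.0f} times the increase in gas")
```

Under these quoted figures, the paperback increase works out to four times the gas increase, which matches the claim in the answer.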
So, some piracy indeed has to do with the price of books spiraling out of control and out of the reach of many readers.
The real question is: "Who is the victim of piracy here?"
Is it the publishers, who have spent a billion dollars to make you think so - or the public, who is paying 4 times as much for paperback books as for gas, when they were the same price when paperbacks first came out?
Obviously, as I mentioned above, some piracy has to do with translations that aren't available to the public, and some of it thrives in places where no legitimate copies are available at all. I have been to locations in Asia and Europe where the publishers simply don't care to sell - no matter how hard one looked - and then they complain that someone is making copies.
In the US it is legal for libraries to make copies for patrons when there is no copy readily available, either due to being hard to get or because the price is too high. I presume this might also be the case in other countries.
I think you will find this in Section 108 of the US Copyright Law.
Obviously when materials are not readily available, people might be expected to take things into their own hands as the law above obviously provides for.
As far as our "legal eagles" go, each Project Gutenberg has guidelines to stay within its local copyright law. The really hard copyright research is handed over to our legal experts.
3. Do you think that Project Gutenberg - the largest online repository of public domain and copyrighted books - threatens the publishing industry's and media conglomerates' vested interests? PG is now distributed on DVD as well. Can this be construed as an incursion into traditional publishers' turf? Is disintermediation on the cards - the blurring of lines between author, publisher, and reader?
A. The publishers view any competition as an incursion on their turf, particularly the expiration of any copyrights, whether the books were still in print or not.
The publishers want to be the ONLY source of information, and to make it available on a "pay per" basis, so the greatest effect of these copyright extensions is not to have MORE books in bookstores, but FEWER, as that new copyright law prevents us from having public domain editions from the millions of books covered by the new copyright terms.
If people knew the copyright laws were being manipulated each time a new technology comes along that COULD actually bring the public domain to the masses, then they probably would say or do something about it. But copyright laws are enacted quietly and behind smokescreens. The US Copyright Act of 1998 was passed in the same 24 hours as President Clinton was impeached, and behind closed doors - I tried to testify - with a voice vote only so there would be no voting record. Thus, a common person would never have heard about it. Even I, who was trying to go testify, didn't learn about it for three weeks.
Every time a new technology was invented that would stop the publishers' monopoly, copyright laws were enacted to stifle it. After all, the first copyright was simply reactionary political maneuvering by The Stationers' Guild to get their monopoly over the written word back, and the same reactionary politics caused the US Copyright Acts to counteract steam printing presses, electric printing presses, the Xerox machine, and now the Internet.
US Copyright Acts were enacted:
1831 to stop the first high speed steam printing press of 1830 and because the first 28 year copyrights from the 1790 Copyright Act were starting to expire. Heaven forbid a copyright should expire!
1998 to stop public domain from flowing through the Internet
1976 to stop public domain from flowing through xeroxes
1909 to stop electric presses from reprinting public domain works, etc.
Every time WE could copy the public domain, they extend copyright time after time after time.
It couldn't BE any more obvious, except that the media won't say anything to us about copyright, so how could we know. . .it's not taught.
We have been threatened with a number of lawsuits, mostly by lawyers who seem to know very little about copyright. After we explained what we are doing, under which laws, it turns out they were just "blowing smoke" at us, trying to make us honor rights they don't have, without any legal explanation of what law[s] would give them rights over the material in question.
We're thinking of starting the OED, Oxford English Dictionary, and we expect more smoke from them, since they reacted this way at our initial announcement of this years ago, and threatened us when we posted "The Oxford Book Of English Verse," but they went away after getting me called on the carpet by a local University of Illinois Chancellor who happened to be Tom Cruise's uncle, and so worth the visit. By the way, this fellow was so Luddite he said he would quit the day he had to use e-mail.
We don't have any affiliation with the UI, but Oxford was going to use all the muscle they could muster. We'll see what they do when we do our first OED posting.
Regarding the DVD, anything that is free can be said to threaten that which is not free, just as anything that is not free can be said to threaten that which is free. If you study the history of copyright, this will become quite obvious.
As for disintermediation, it has been there all along. If you have computers you can be a publisher, an author, a reader. . .with a potential audience larger than any paper medium. Recently this has been exemplified by the first million selling music download by Gwen Stefani. . . Just think what is going to happen when we have our first million selling music download that isn't run through a major music label! Think it can't happen? Just watch, and remember what happened after Dido's initial CD flopped with no push from her label: it became a multimillion seller after it was sampled in that famous music video. Not to mention that Lisa Loeb had a million seller on CD without ever being signed to a label. The day is coming when artists, musicians, authors, and other creators will be free from the contracts of the publishing industry that give them $50,000 out of each million dollars in sales.
4. Books are now being read on more platforms than ever - PDAs, iPods, cell phones, and even Sony PlayStations. How does this affect the very definition of the book? In other words, what is the future of the book in terms of format?
A. My own view has always been to support as many ways to read eBooks as possible, so this doesn't change anything about my definition of eBooks.
I LOVE it when I get an email from someone reading a PG eBook in Urdu on a cell phone in the Serengeti Plain in Africa!
THAT is what PG is all about!!!
We just added our 47th language at http://www.gutenberg.org and we have 104 languages at http://www.gutenberg.cc and 65 languages at http://pge.rastko.net (Project Gutenberg Europe), and are coming up on 500 eBooks at PG of Australia. (BTW, that 47th language above is from New South Wales, Australia!) I can't wait until I get an email from someone reading in Kamilaroi!
Regarding the various formats and platforms:
We are working on a system to create the eBooks in an XMLish format that can be converted into dozens of other formats, on the fly, so that anyone can instantly get any of our eBooks in any popular format.
Usually there are Project Gutenberg eBooks available for any new platform, such as the iPod, only a week after it comes out. We can't take credit for this; our readers and volunteers are the ones who come up with these instant versions, and with nearly everything else Project Gutenberg does.
My own contribution is now mainly to hold things together, to make eBooks really take off, to make sure everyone can get tens of thousands of free eBooks, someday tens of millions.
5. What is the most important thing you have learned in the 35 years that you have spent considering the world of eBooks?
A. I would have to say the most important thing I learned in the past 35 years of thinking about eBooks is that the underlying philosophy since time immemorial is:
"It is better if I have it, and YOU do NOT have it."
The Philosophy of Limited Distribution.
The primary quality of eBooks is that everyone can have them.
The philosophy of Unlimited Distribution.
This is perhaps the largest paradigm shift possible in world thought, shifting from the ideal that all things can be had only in limited supply ("supply side economics") to a new ideal that things can be produced such that everyone can have all they want.
With eBooks, everyone can have all they want without any effort to limit what other people can have. Before eBooks this was only possible with the air supply.
The real question is going to become more and more obvious as we move closer and closer to the technology of The Star Trek Replicator.
What will happen when EVERYone CAN have everyTHING???
Will they pass laws against that, too???
6. What should be the role of government in all this?
A. We have had governments for ages that have SAID they would be delighted to feed, clothe, house, and educate the world, if it were not so expensive.
Yet for 35 years no government has taken the steps to provide an electronic public library for the people. Add to this the number of academic institutions, cities, states, and nations, as well as charities, and you begin to realize that eBooks in some sense are being ignored by thousands of institutions who SAY their interest lies in providing for the masses.
We have been capable of bringing every word ever written to a wider audience than ever before for years, but the truth is a movement to deny access to this information has been underway for even longer in the form of continuous copyright extensions.
The prime example, obviously the one I named my own work after when I started Project Gutenberg, is The Gutenberg Press. Before Gutenberg the average book cost as much as the average family farm, and thus was out of the question for the average person on the street, much less for the even greater number of people who lived in places that didn't have any paved streets. Books were virtually inaccessible before The Gutenberg Press, other than to the elite of wealth, education, and religion.
Not only were books inaccessible to the person on the street, but even if they could manage to get a book the vast majority couldn't even come close to reading it. This provided a great wall insulating Haves from Haves-not. The Haves could read, the Haves-not could not read, and the advantage to the Haves is incalculable.
If you look at the attitudes toward Unlimited Distribution of eBooks you will find that the primary motivation here is wall preservation: preserving Haves and Haves-not as classes in a time when billions could have every word ever written.
There have been well over a billion computers made. There have been one billion cell phones added since the beginning of last year, and another billion, or more, may be made before the end of next year, and each is going to be capable of serving as an eBook reader. And this does not include millions of PDAs, iPods, etc., much less millions of game consoles that can be used for eBooks.
The truth is that there have been enough eBook capable devices made that everyone who can read could have one and still some would be left over.
At the time of The Gutenberg Press, hardly anyone could read, and yet it would have been impossible to deliver one copy for each of them, of whatever your favorite book was. But AFTER Gutenberg the number of books printed each year was greater than the population of the places that made them. Books, and thus literacy, had finally come to the masses.
However, this did not appeal to those who had previously held monopoly power over all publication: The Stationers Company.
By the time The Gutenberg Press had gotten a strong foothold, publishing millions of books per year, The Stationers had bid for new laws to make all publication, other than their own, a violation of the law.
They did this in two very powerful ways:
1. Everyone else's printing presses were declared illegal.
2. A "copyright" patent was granted The Stationers, to "own" the only license for publication of all words ever written.
The first few attempts at such laws were met with such hatred that they were never enforced, and finally were withdrawn.
However, after over 150 years of trying to convince dozens of courtiers and monarchs, and failing, "The Stationers Company" was finally granted a royal patent, and became the only legal operators of the dreaded Gutenberg Press that had ruined such monopoly powers they had had since the dawn of time.
Project Gutenberg
7. Is PG self-financing? Does it rely on donations? Does it receive any support or sponsorship from publishers and authors?
A. We don't really deal with money all that much or with financing as most people see it. We are nearly all volunteers, so there is very little in the way of finances. We rely more on donations of time and energy than on donations of money.
I, myself, haven't received my monthly paycheck for about 2.5 years.
We don't receive any corporate sponsorship, or the various grants you hear about for making digital libraries.
In fact, just this week, I received a copy of a small magazine about eBooks that mentioned a conference of some 30 eBook makers, but did not mention Project Gutenberg at all. Interestingly enough, they included a poster of a few dozen logos of eBook makers, and it appears they cut off the poster exactly where the words "Project Gutenberg" were in our own logo.
They TALK about global information sharing, but they are really a collection of insiders doing insider things, and they are not really interested in getting eBooks to the common person, but rather mostly to those who are already well-read and well-educated. In this sense, I agree with those who say there is still a great deal of Digital Divide.
However, we aren't going to go under, either, as they always say we will. Those who are used to living with no money, don't depend on it.
8. What are the legal and operational relationships between PG, PG Australia, PG Europe, and Distributed Proofreaders? How does PG collaborate and fit in with P2P file sharing networks such as BitTorrent?
A. There are no legal or operational relationships that I know of; we don't even email each other very often. . .not for months at a time. Project Gutenberg is only registered as a trademark in the US and, as far as I know, we have no legal control over it in other countries, though the other Project Gutenberg efforts have been mostly very nice about using it the same way we do.
Regarding P2P networks, we pretty much allow anyone to do filesharing with our eBooks, as long as they aren't charging anything. . .it's not something specific to BitTorrent or any specific system. We do happen to both run BitTorrent and provide MagnetLinks (p2p) ourselves, but we're open to essentially any file sharing. Although we have a rather lengthy trademark licensing policy, it allows essentially any non-commercial use, including p2p and other filesharing methods.
9. What is the future role of machine translation in PG and other e-text databases?
A. This is perhaps the most important question you have raised, other than the issue that copyright will become permanent, and then we won't have any more public domain entries to work with.
My personal prediction is that when we have 10 million eBooks online, MT (Machine Translation) will be about where OCR (Optical Character Recognition) was when the world first started to become really aware of Project Gutenberg in 1989, some 16 years ago.
Then the next big project will be to translate those 10 million books into 100 different languages, so we will have a billion books to send to a billion potential readers. . . . For those who love big numbers, that's a QUINTILLION books given away.
10 million titles in 100 languages = 1 billion books
1 billion books to 1 billion people = 1 quintillion books
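The two multiplications above can be verified with a few lines of Python. This is only a sketch using the round figures quoted in the answer; the variable names are illustrative. The final product is 10^18, which is a quintillion on the short scale.

```python
# Round figures quoted in the answer.
TITLES = 10_000_000       # 10 million eBooks
LANGUAGES = 100           # target translation languages
READERS = 1_000_000_000   # 1 billion potential readers

# Every title in every language.
books = TITLES * LANGUAGES

# One copy of every book for every reader.
copies_given_away = books * READERS

print(f"{books:,} books")
print(f"{copies_given_away:,} copies given away")
```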
10. What are you, at PG, planning for your 35th anniversary on July 4, 2006?
A. We have only 7 months left to the 35th Anniversary of Project Gutenberg. If you have any particular ideals or ideas you would like to have included in these events surrounding July 4th, 2006, please let me know so I will be able to coordinate efforts to ensure they will all be ready to go for a timely release and maximum dispersal to our various audiences.
These would hopefully include:
I. The 35th Anniversary Of Project Gutenberg
II. The 20,000th Title Added at http://www.gutenberg.org
III. The 50,000th Title Added at http://www.gutenberg.cc
IV. The 500th Title was just added at Project Gutenberg of Australia
V. The 500th Title Added at Project Gutenberg of Europe
VI. The xxth Title Added at Project Gutenberg of Canada
VII. The Grand Opening of Project Gutenberg of the Philippines
VIII. The Official Release of the first "Million Dollar DVD"
In closing, I would like to say that we stand now at the REAL Digital Divide. . .the choice between free copying, from a free public domain. . .and only commercial copying from commercial sources.
When everything is copyrighted, patented, trademarked, etc., what difference will it make if someone invents a Replicator, if it is illegal to copy anything?
Will the copyright laws continue to be extended over and over and over and over again?
Or will there someday be a world in which the promise of new technology is not reined in, or reigned over, by an old system designed to preserve the separation between the Haves and the Haves-not?
