The Cathedral and the Bazaar
Eric Raymond
Translated by HansB
--------------------------------------------------------------------------------
I. The Cathedral and the Bazaar
Linux has had an enormous impact. Even five years ago, who would have imagined that a world-class operating system could be created in their spare time by thousands of developers scattered around the world and connected only by the thin strands of the Internet?
Certainly not I. By the time I started paying attention to Linux in early 1993, I had already been involved in Unix and free software development for ten years. I was one of the first GNU contributors in the mid-1980s. I had released a good deal of free software, developing or co-developing several programs that are still in wide use (nethack, Emacs VC and GUD modes, xlife, and others). I thought I knew how it was done.
Linux overturned much of what I thought I knew. I had been preaching the Unix gospel of small tools, rapid prototyping, and evolutionary programming for years. But I also believed that above a certain critical complexity a more centralized, disciplined approach was required. I believed that the most important software (operating systems and really large tools like Emacs) needed to be built like cathedrals, carefully crafted by a small band of wizards, with no beta released before its time.
Linus Torvalds's style of development (release early and often, delegate everything you can, be open to every change and merge) came as a surprise. There was no quiet, reverent cathedral-building here. On the contrary, the Linux community looked like a great babbling bazaar of differing agendas and approaches (aptly symbolized by the Linux archive sites, which would accept submissions from anyone), out of which a coherent and stable system could seemingly emerge only by a succession of miracles.
The fact that this bazaar style worked, and worked well, came as a distinct shock. As I learned my way around, I worked hard not just at individual projects but also at trying to understand why the Linux world not only failed to fly apart in confusion but went from strength to strength at a speed barely imaginable to cathedral-builders.
By 1996 I thought I was beginning to understand. Chance handed me a perfect way to test my theory, in the form of a free-software project that I could consciously run in the bazaar style. So I did, and it was a significant success.
In the rest of this article I will tell the story of that project, and use it to propose some aphorisms about effective free-software development. Not all of them were first learned in the Linux world, but we will see how the Linux world gives them particular point. If I am correct, they will help you understand exactly what makes the Linux community such a good source of software, and help you become more productive yourself.
II. The Mail Must Get Through
I had been running the technical side of a small free-access ISP called Chester County InterLink (CCIL) in West Chester, Pennsylvania. (I co-founded CCIL and wrote our unique multi-user BBS software; you can check it out by telnetting to locke.ccil.org. Today it supports three thousand users on nineteen lines.) The job allowed me to be online 24 hours a day through CCIL's 56K line; in fact, it practically demanded it!
Accordingly, I had become quite used to instant Internet email. For complicated reasons, it was hard to get SLIP to work between my home machine (snark.thyrsus.com) and CCIL. When I finally succeeded, I found that having to telnet over to locke to check my mail was annoying. What I wanted was for my mail to be delivered on snark, so that biff(1) would notify me when it arrived. Simple sendmail forwarding would not work, because snark is not always online and does not have a static IP address. I needed a program that would reach out over my SLIP connection and pull my mail across for local delivery. I knew such things existed, and that most of them used a simple protocol called POP (Post Office Protocol). And sure enough, locke's BSD/OS operating system already included a POP3 server.
I needed a POP3 client. So I went online to find one. In fact, I found three or four. I used pop-perl for a while, but it was missing what seemed an obvious feature: the ability to rewrite the addresses on fetched mail so that replies would work properly.
The problem was this: suppose someone named "joe" on locke sent me mail. If I fetched it to snark and then tried to reply, my mailer would cheerfully try to send the reply to a nonexistent "joe" on snark. Hand-editing reply addresses to add "@ccil.org" quickly became a serious pain.
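The address problem described above can be sketched in a few lines. This is only an illustration, not how any of the POP clients mentioned here worked: the function name qualify_addresses is hypothetical, and it assumes the header value is a plain comma-separated address list that Python's standard email.utils module can parse.

```python
from email.utils import formataddr, getaddresses


def qualify_addresses(header_value, default_domain):
    """Append default_domain to every address that lacks an @domain part.

    header_value is the text of an address header such as From or Cc.
    Hypothetical helper for illustration only.
    """
    fixed = []
    for name, addr in getaddresses([header_value]):
        if addr and "@" not in addr:
            # A bare local name like "joe" only makes sense on the mail server,
            # so qualify it with the server's domain before replying.
            addr = f"{addr}@{default_domain}"
        fixed.append(formataddr((name, addr)))
    return ", ".join(fixed)


print(qualify_addresses("joe", "ccil.org"))  # joe@ccil.org
```

Addresses that already carry a domain pass through unchanged, so running the rewrite on every fetched message is harmless.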
This was clearly something the computer ought to be doing for me. (In fact, according to RFC 1123 section 5.2.18, sendmail ought to be doing it.) But none of the existing POP clients knew how! And this brings us to the first lesson:
1. Every good work of software starts by scratching a developer's personal itch.
Perhaps this should have been obvious (it has long been proverbial that "necessity is the mother of invention"), but too often software developers spend their days on programs they neither need nor love. Not so in the Linux world, which may explain why the quality of software originating in the Linux community is so high.
So, did I immediately launch into a furious burst of coding to build a brand-new POP3 client to compete with the existing ones? Not on your life! I looked carefully at the POP utilities I had in hand and asked myself, "Which one is closest to what I want?" Because:
2. Good programmers know what to write. Great ones know what to rewrite (and reuse).
I do not claim to be a great programmer, but I try to imitate one. An important trait of the great ones is constructive laziness. They know that you are rewarded for results rather than effort, and that it is almost always easier to start from a good partial solution than from nothing at all.
Linus, for example, did not try to write Linux from scratch. Instead, he started by reusing code and ideas from Minix (a tiny Unix-like operating system for 386 machines). Eventually all the Minix code went away or was completely rewritten, but while it was there it provided scaffolding for the infant that would become Linux.
In the same spirit, I went looking for an existing POP utility that was reasonably well coded, to use as a development base.
The source-sharing tradition of the Unix world has always been friendly to code reuse (this is why the GNU project chose Unix as its base operating system, in spite of serious reservations about Unix itself). The Linux world has taken this tradition nearly to its technological limit: it has terabytes of source code generally available. So spending time looking for someone else's almost-good-enough is more likely to give good results in the Linux world than anywhere else.
And it did for me. Together with those I had found earlier, my second search turned up nine candidates: fetchpop, PopTart, get-mail, gwpop, pimp, pop-perl, popc, popmail, and upop. The one I first settled on was "fetchpop". I added my header-rewrite feature to it and made various other improvements, which the author accepted into his 1.9 release.
A few weeks later, though, I stumbled across the code for "popclient" by Carl Harris, and found I had a problem. Although fetchpop had some good original ideas (such as its daemon mode), it could handle only POP3 and was rather amateurishly coded (its author, Seung-Hong, was a bright but inexperienced programmer). Carl's code was better, quite professional and solid, but his program lacked several important and rather tricky-to-implement fetchpop features (including the ones I had written myself).
Stay or switch? If I switched, I would be throwing away the code I had already written in exchange for a better development base.
One practical motive to switch was multiple-protocol support. POP3 is the most widely used of the post-office protocols, but not the only one. Fetchpop and the other competitors did not implement POP2, RPOP, or APOP, and I already had a vague idea of adding IMAP (Internet Message Access Protocol, the most recently designed and most powerful post-office protocol) just for fun.
But I also had a more theoretical reason to think switching might be a good idea, something I had learned long before Linux:
3. "Plan to throw one away; you will, anyhow." (Fred Brooks, "The Mythical Man-Month", Chapter 11)
Or, to put it another way, you often do not really understand the problem until after the first time you implement a solution. The second time, maybe you know enough to do it right. So if you want to get it right, be ready to start over at least once.
Well (I told myself), the changes to fetchpop had been my first try. So I switched.
After I sent my first set of popclient patches to Carl Harris on June 25, 1996, I found out that he had lost interest in popclient some time before. The code was a bit dusty, with some old bugs hanging around. I had many changes to make, and we quickly agreed that the logical thing was for me to take over the program. Without my actually noticing, the project had escalated. No longer was I just contemplating a few minor patches to an existing POP client. I had taken on maintaining an entire one, and there were ideas bubbling in my head that I knew would probably lead to major changes.
In a software culture that encourages code sharing, this is a natural way for a project to evolve. I was acting out this principle:
4. If you have the right attitude, interesting problems will find you.
But Carl Harris's attitude was even more important. He understood that:
5. When you lose interest in a program, your last duty to it is to hand it off to a competent successor.
Without ever having to discuss it, Carl and I knew we had a common goal of finding the best solution. The only question for either of us was whether I could establish that I was a safe pair of hands. Once I did, he handed over the program with grace and dispatch. I hope I will do as well when my turn comes.
III. The Importance of Having Users
And so I inherited popclient. Just as importantly, I inherited popclient's user base. Users are wonderful things to have, and not just because they demonstrate that you are serving a need and have done something right. Properly cultivated, they can become co-developers.
Another strength of the Unix tradition, one that Linux pushes to a happy extreme, is that a lot of users are hackers too. Because the source code is available, they can be effective hackers. This is tremendously useful for shortening debugging time. Given a bit of encouragement, your users will diagnose problems, suggest fixes, and help improve the code far more quickly than you could expect on your own.
6. Treating your users as co-developers is your least-hassle route to rapid code improvement and effective debugging.
The power of this effect is easy to underestimate. In fact, almost all of us in the free-software world drastically underestimated how well it would scale against system complexity, until Linus showed us otherwise.
In fact, I think Linus's cleverest and most consequential hack was not the construction of the Linux kernel itself, but his invention of the Linux development model. When I expressed this opinion in his presence once, he smiled and quietly repeated something he has often said: "I'm basically a very lazy person who likes to get credit for things other people actually do." Lazy like a fox. Or, as Robert Heinlein might have said, too lazy to fail.
In retrospect, one precedent for the methods and success of Linux can be seen in the development of the GNU Emacs Lisp library and Lisp code archives. In contrast to the cathedral-building style of the Emacs C core and most other FSF tools, the evolution of the Lisp code pool was fluid and very user-driven. Ideas and prototypes were often rewritten three or four times before reaching a stable final form, and loosely coupled collaborations over the Internet were frequent.
Indeed, my own most successful single hack before fetchmail was probably Emacs VC mode, a Linux-like collaboration by email with three other people, only one of whom (Richard Stallman) I have ever met. It was a front end for SCCS, RCS, and later CVS that offered "one-touch" version control operations from within Emacs. It evolved from a tiny, crude mode somebody else had written. And the development of VC succeeded because, unlike Emacs itself, Emacs Lisp code could go through release/test/improve generations very quickly.
(One unexpected side effect of the FSF's policy of trying to legally bind code into the GPL is that it becomes procedurally harder for the FSF to use the bazaar mode, since they believe they must get a copyright assignment for every individual contribution in order to protect GPLed code from challenge under copyright law. Users of the BSD and MIT X Consortium licenses do not have this problem, since they are not trying to reserve rights that anyone might have an incentive to challenge.)
IV. Release Early, Release Often
Early and frequent releases are a critical part of the Linux development model. Most developers (including me) used to believe this was bad policy for anything larger than a trivial project, because early versions are almost by definition buggy versions and you do not want to wear out the patience of your users.
This belief reinforced the general commitment to a cathedral-building style of development. If the overriding objective is for users to see as few bugs as possible, then of course you release only once every six months (or less often) and work like a dog on debugging between releases. The Emacs C core was developed this way. The Lisp library, in effect, was not, because there were active Lisp archives outside the FSF's control where you could find new and development code versions independently of Emacs's release cycle.
The most important of these, the Ohio State elisp archive, anticipated the spirit and many of the features of today's big Linux archives. But few of us thought very hard about what we were doing, or about what the very existence of that archive suggested about problems in the FSF's cathedral-building development model. I made one serious attempt around 1992 to get a lot of the Ohio code formally merged into the official Emacs Lisp library. I ran into political trouble and was largely unsuccessful.
But a year later, as Linux became widely visible, it was clear that something different and much healthier was going on there. Linus's open development policy was the very opposite of cathedral-building. The sunsite and tsx-11 archives were growing rapidly, and multiple distributions were being launched. And all of this was driven by an unheard-of frequency of core system releases.
Linus was treating his users as co-developers in the most effective possible way:
7. Release early. Release often. And listen to your customers.
Linus's innovation was not so much in doing this (something like it had been a Unix-world tradition for a long time) as in scaling it up to a level of intensity that matched the complexity of what he was developing. In those early days it was not unknown for him to release a new kernel more than once a day. And because he cultivated his base of co-developers and leveraged the Internet for collaboration harder than anyone else, this worked.
But how did it work? Was it something I could duplicate, or did it depend on some unique genius of Linus's?
I did not think so. Granted, Linus is a very good hacker (how many of us could engineer an entire production-quality operating system kernel?). But Linux did not represent any awesome conceptual leap forward. Linus is not (or at least, not yet) an innovative genius of design in the way that, say, Richard Stallman or James Gosling are. Rather, Linus seems to me to be a genius of engineering, with a sixth sense for avoiding bugs and development dead ends and a true knack for finding the minimum-effort path from point A to point B. Indeed, the whole design of Linux benefits from this quality and mirrors Linus's essentially conservative and simplifying design approach.
So, if rapid releases and leveraging the Internet to the hilt were not accidents but integral parts of Linus's engineering insight into the minimum-effort path, what was he maximizing? What was he cranking out of the machinery?
Put that way, the question answers itself. Linus was keeping his hacker users constantly stimulated and rewarded: stimulated by the prospect of an ego-satisfying piece of the action, rewarded by the sight of constant (even daily) improvement in their work.
Linus was directly aiming to maximize the number of person-hours thrown at debugging and development, even at the possible cost of instability in the code and of losing users if any serious bug proved intractable. Linus was behaving as though he believed something like this:
8. Given a large enough beta-tester and co-developer base, almost every problem will be characterized quickly and the fix will be obvious to someone.
Or, less formally, "Given enough eyeballs, all bugs are shallow" (as the saying goes, the eyes of the masses are sharp). I dub this "Linus's Law".
My original formulation was that every problem "will be transparent to somebody". Linus demurred that the person who understands and fixes the problem is not necessarily, or even usually, the person who first characterizes it. "Somebody finds the problem," he said, "and somebody else understands it. And I'll go on record as saying that finding it is the bigger challenge." But the point is that both things tend to happen quickly.
Here, I think, is the core difference between the cathedral-builder and bazaar styles. In the cathedral-builder view of programming, bugs and development problems are tricky, insidious, deep phenomena. It takes months of scrutiny by a dedicated few to develop confidence that you have found them all. Thus the long release intervals, and the inevitable disappointment when long-awaited releases are not perfect.
In the bazaar view, on the other hand, you assume that bugs are generally shallow phenomena, or at least that they turn shallow quickly when exposed to a thousand eager co-developers pounding on every new release. Accordingly you release often in order to get more corrections, and as a beneficial side effect you have less to lose if an occasional botch gets out the door.
Maybe it should not have been such a surprise. Sociologists discovered years ago that the averaged opinion of a mass of equally expert (or equally ignorant) observers is considerably more reliable than that of a single randomly chosen one of them. They called this the "Delphi effect". What Linus appears to have shown is that this applies even to debugging an operating system: the Delphi effect can tame development complexity even at the level of an operating system kernel.
I am indebted to Jeff Dutky (dutky@wam.umd.edu) for pointing out that Linus's Law can be rephrased as "Debugging is parallelizable". Jeff observes that although debugging requires debuggers to communicate with some coordinating developer, it does not require significant coordination between debuggers. Thus it does not fall prey to the quadratic complexity and management overhead that make adding developers problematic.
In practice, the theoretical loss of efficiency due to duplicated work by debuggers almost never seems to be an issue in the Linux world. One effect of a "release early and often" policy is to minimize such duplication by propagating fed-back fixes quickly.
Brooks even made an offhand observation related to Jeff's: "The total cost of maintaining a widely used program is typically 40 percent or more of the cost of developing it. Surprisingly this cost is strongly affected by the number of users. More users find more bugs." (My emphasis is on the last sentence.)
More users find more bugs because adding more users adds more ways of stressing the program. This effect is amplified when the users are co-developers. Each one approaches the task of bug characterization with a slightly different perceptual set and analytical toolkit, a different angle on the problem. The Delphi effect seems to work precisely because of this variation. In the specific context of debugging, the variation also tends to reduce duplication of effort.
So adding more beta testers may not reduce the complexity of the current "deepest" bug from the developer's point of view, but it increases the probability that someone's toolkit will match the problem in such a way that the bug is shallow to that person.
Linus hedges his bets, too. In case there are serious bugs, Linux kernel versions are numbered in such a way that potential users can choose either to run the last version designated "stable" or to ride the cutting edge and risk bugs in order to get new features. This tactic is not yet formally imitated by most Linux hackers, but perhaps it should be; the fact that either choice is available makes both more attractive.
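To make the numbering idea concrete, here is a minimal sketch. The text above does not spell out the scheme; the sketch assumes the convention kernels of that period followed, in which an even second number (as in 2.0.x) marked a stable series and an odd one (as in 2.1.x) a development series. The function name is hypothetical.

```python
def is_stable_series(version):
    """Return True if a kernel version string belongs to a stable series.

    Assumption for illustration: an even second component means stable,
    an odd one means development.
    """
    major, minor, *_rest = (int(part) for part in version.split("."))
    return minor % 2 == 0


for v in ("2.0.30", "2.1.42"):
    print(v, "stable" if is_stable_series(v) else "development")
```

The point of encoding the choice in the number itself is that a cautious user can tell which releases to skip without reading any release notes.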
V. When Is a Rose Not a Rose?
Having studied Linus's behavior and formed a theory about why it was successful, I made a conscious decision to test this theory on my new (admittedly much less complex and ambitious) project.
But the first thing I did was reorganize and simplify popclient a lot. Carl Harris's implementation was very sound, but exhibited a kind of unnecessary complexity common to many C programmers. He treated the code as central and the data structures as support for the code. As a result, the code was beautiful but the data structure design was ad hoc and rather ugly (at least by the standards of this old LISP hacker). I had another purpose for rewriting, however, besides improving the code and the data structure design: to evolve it into something I understood completely. It is no fun to be responsible for fixing bugs in a program you do not understand.
For the first month or so, then, I was simply following out the implications of Carl's basic design. The first serious change I made was to add IMAP support. I did this by reorganizing the protocol machines into a generic driver and three method tables (for POP2, POP3, and IMAP). This and the previous changes illustrate a general principle that is good for programmers to keep in mind, especially in languages like C that do not naturally do dynamic typing:
9. Smart data structures and dumb code work a lot better than the other way around.
Fred Brooks also said it in Chapter 11: "Show me your [code] and conceal your [data structures], and I shall continue to be mystified. Show me your [data structures], and I won't usually need your [code]; it'll be obvious." Actually, he said "flowcharts" and "tables". But allowing for thirty years of terminological shift, the point is almost the same.
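The "generic driver plus method tables" arrangement can be sketched as follows. This is not fetchmail's code: the names Protocol and session_commands are hypothetical, the sketch only builds command strings instead of talking to a server, and it shows two rows (POP3 and IMAP) where the real program had three. The fixed IMAP tag "a" is a simplification.

```python
from typing import Callable, List, NamedTuple


class Protocol(NamedTuple):
    """One row of the method table: everything that differs per protocol."""
    name: str
    port: int
    login: Callable[[str, str], List[str]]
    retrieve: Callable[[int], str]
    delete: Callable[[int], str]


POP3 = Protocol(
    "POP3", 110,
    login=lambda user, password: [f"USER {user}", f"PASS {password}"],
    retrieve=lambda n: f"RETR {n}",
    delete=lambda n: f"DELE {n}",
)

IMAP = Protocol(
    "IMAP", 143,
    login=lambda user, password: [f"a LOGIN {user} {password}"],
    retrieve=lambda n: f"a FETCH {n} RFC822",
    delete=lambda n: f"a STORE {n} +FLAGS (\\Deleted)",
)


def session_commands(proto, user, password, message_count):
    """Generic driver: one dumb loop that works for any table row."""
    commands = list(proto.login(user, password))
    for n in range(1, message_count + 1):
        commands.append(proto.retrieve(n))
        commands.append(proto.delete(n))
    return commands


print(session_commands(POP3, "esr", "secret", 1))
```

Adding another protocol means adding another table row; the driver loop does not change, which is the sense in which the data structure is smart and the code is dumb.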
At this point (early September 1996, about six weeks from zero) I started thinking that a name change might be in order; after all, it was not just a POP client any more. But I hesitated, because there was as yet nothing genuinely new in the design. My version of popclient had yet to develop an identity of its own.
That changed radically when fetchmail learned how to forward fetched mail to the SMTP port. I will get to that in a moment. But first: I said above that I had decided to use this project to test my theory about what Linus Torvalds had done right. How (you may well ask) did I do that? In these ways:
1. I released early and often (almost never less often than every ten days; during periods of intense development, once a day).
2. I grew my beta list by adding to it everyone who contacted me about fetchmail.
3. I sent announcements to the beta list whenever I released, encouraging people to participate.
4. I listened to my beta testers, polling them about design decisions and thanking them whenever they sent in patches and feedback.
The payoff from these simple measures was immediate. I got bug reports of a quality most developers would kill for, often with good fixes attached. I got thoughtful criticism, I got fan mail, and I got intelligent feature suggestions. Which leads to:
10. If you treat your beta testers as if they are your most valuable resource, they will respond by becoming your most valuable resource.
VI. Popclient Becomes Fetchmail
The real turning point in the project came when Harry Hochheiser sent me his draft code for forwarding mail to the client machine's SMTP port. I realized almost immediately that a reliable implementation of this feature would make all the other delivery modes next to obsolete.
For many weeks I had been tweaking fetchmail incrementally rather than really improving it, feeling that the interface design was serviceable but clumsy, inelegant, and cluttered with too many fiddly options.
When I thought about SMTP forwarding, I saw that popclient had been trying to do too many things. It had been designed to be both a mail transport agent (MTA) and a local delivery agent (MDA). With SMTP forwarding, it could get out of the MDA business and be a pure MTA, handing off mail to other programs for local delivery just as sendmail does.
Why mess with all the complexity of configuring a mail delivery agent, or of setting up lock-and-append on a mailbox, when port 25 is almost guaranteed to be there on any platform with TCP/IP support? Especially when this means that retrieved mail is guaranteed to look like normal sender-initiated SMTP mail, which is what we want anyway.
There are several lessons here. First, this SMTP-forwarding idea was the biggest single payoff I got from consciously trying to emulate Linus's methods. A user gave me this terrific idea; all I had to do was understand its implications.
11. The next best thing to having good ideas is recognizing good ideas from your users. Sometimes the latter is better.
Interestingly enough, you will quickly find that if you are completely truthful about how much you owe other people, the world at large will treat you as though you did every bit of the invention yourself and are just being becomingly modest about your genius. We can all see how well this has worked for Linus!
(When I gave this paper at the Perl conference in August 1997, Larry Wall was in the front row. As I got to the point above he called out, "Tell it, tell it, brother!" The whole audience laughed, because they knew the same thing had happened to the inventor of Perl.)
After a few weeks of running my project in the same spirit, I began to get similar praise from my users. I stashed away some of that email; I will look at it again if I ever start wondering whether my life has been worthwhile. :)
One interesting measure of fetchmail's success is the sheer size of the project beta list, fetchmail-friends. At the time of writing it has 249 members and is adding two or three a week.
Actually, as I revise this in May 1997, the list is beginning to shrink for an interesting reason. Several people have asked me to unsubscribe them because fetchmail is working so well for them that they no longer need to see the list traffic! Perhaps this is part of the normal life cycle of a mature bazaar-style project.
However, there are two more fundamental, non-political lessons here that are general to all kinds of design.
12. Often, the most striking and innovative solutions come from realizing that your concept of the problem was wrong.
I had been trying to solve the wrong problem by continuing to develop popclient as a combined MTA/MDA with all kinds of local delivery modes. Fetchmail's design needed to be rethought from the ground up as a pure MTA, a part of the normal Internet mail path.
When you hit a wall in development (when you find yourself hard put to think past the next patch), it is often time to ask not whether you have the right answer, but whether you are asking the right question. Perhaps the problem needs to be reframed.
So I reframed my problem. Clearly, the right thing to do was (1) build SMTP forwarding support into the generic driver, (2) make it the default mode, and (3) eventually throw out all the other delivery modes, especially the deliver-to-file and deliver-to-standard-output options.
I hesitated over step 3 for some time, fearing to upset long-time popclient users who depended on the alternate delivery mechanisms. In theory, they could immediately switch to .forward files or their non-sendmail equivalents to get the same effects. In practice the transition might have been messy.
But when I did it, the benefits proved huge. The cruftiest parts of the driver code vanished. Configuration became radically simpler: no more hunting for the system MDA and the user's mailbox, and no more worrying about whether the underlying OS supports file locking.
Also, the only way to lose mail vanished. If you specified delivery to a file and the disk filled up, your mail was lost. This cannot happen with SMTP forwarding, because the SMTP listener will not return OK unless the message can be delivered or at least spooled for later delivery.
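The safety argument above can be sketched with Python's standard smtplib. This is an illustration under assumptions, not fetchmail's implementation: the function name, the delete_from_server callback, and the injectable smtp_factory are all hypothetical, and the sketch assumes an SMTP listener on localhost port 25.

```python
import smtplib


def forward_if_accepted(raw_message, sender, recipient, delete_from_server,
                        smtp_factory=lambda: smtplib.SMTP("localhost", 25)):
    """Hand one fetched message to the local SMTP listener.

    The message is deleted from the remote mailbox only after the listener
    has accepted responsibility for it; on any failure it stays on the
    server and can be fetched again later.
    """
    try:
        with smtp_factory() as smtp:
            refused = smtp.sendmail(sender, [recipient], raw_message)
    except (smtplib.SMTPException, OSError):
        return False  # listener unreachable or message rejected: keep it
    if refused:
        return False  # recipient refused: keep it
    delete_from_server()
    return True
```

The design choice worth noticing is the ordering: deletion happens strictly after acceptance, so a full disk or a down listener costs a retry rather than a lost message.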
Performance improved as well (though not so much that you would notice in a single run). Another not insignificant benefit of this change was that the manual page became much simpler.
Later, I had to bring back delivery via a user-specified local MDA in order to handle some obscure situations involving dynamic SLIP. But I found a much simpler way to do it.
The moral? Do not hesitate to throw away superannuated features when you can do so without loss of effectiveness. Antoine de Saint-Exupéry (who was an aviator and aircraft designer when he was not writing classic children's books) said:
13. "Perfection (in design) is achieved not when there is nothing more to add, but rather when there is nothing more to take away."
When your code is getting both better and simpler, that is when you know it is right. And in the process, the fetchmail design acquired an identity of its own, different from the ancestral popclient.
It was time for the name change. The new design looked much more like a dual of sendmail than the old popclient had: both are MTAs, but where sendmail pushes and then delivers, the new popclient pulls and then delivers. So, two months in, I renamed it fetchmail.
VII. Fetchmail Grows Up
There I was with a neat and innovative design, code that I knew worked well because I used it every day, and a growing beta list. It gradually dawned on me that I was no longer engaged in a trivial personal hack that might happen to be useful to a few other people. I had my hands on a program that every hacker with a Unix box and a SLIP/PPP mail connection really needs.
With the SMTP forwarding feature, it pulled far enough ahead of the competition to potentially become a "category killer", one of those classic programs that fills its niche so competently that the alternatives are not just discarded but almost forgotten.
I think you cannot really aim or plan for a result like this. You have to be pulled into it by design ideas so powerful that afterward the results seem inevitable, natural, even foreordained. The only way to try for ideas like that is to have lots of ideas, or to have the engineering judgment to take other people's good ideas beyond where their originators thought they could go.
Andrew Tanenbaum had the original idea of building a simple native Unix for the 386, for use as a teaching tool. Linus Torvalds pushed the Minix concept further than Andrew probably thought it could go, and it grew into something wonderful. In the same way (though on a smaller scale), I took some ideas by Carl Harris and Harry Hochheiser and pushed them hard. None of us was "original" in the romantic way people think of as genius. But then, most science, engineering, and software development is not done by original genius, popular mythology to the contrary.
The results were pretty heady stuff all the same; in fact, just the kind of success every hacker lives for! And they meant I would have to set my standards even higher. To make fetchmail as good as I now saw it could be, I would have to write not just for my own needs, but also include and support features necessary to others yet outside my own orbit, and do that while keeping the program simple and robust.
The first and overwhelmingly most important feature I wrote after realizing this was multidrop support: the ability to fetch mail from a mailbox that has accumulated all the mail for a group of users, and then route each message to its individual recipients.
I decided to add multidrop support partly because some users were clamoring for it, but mostly because I thought it would shake bugs out of the single-drop code by forcing me to deal with addressing in full generality. And so it proved. Getting RFC 822 parsing right took me a remarkably long time, not because any individual piece of it is hard but because it involves a pile of interdependent and fussy details.
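A toy version of multidrop routing gives a feel for the addressing problem. This sketch is far simpler than anything a real mail tool needs: it looks only at the To and Cc headers, leans on Python's standard email package for the RFC 822 parsing, and the function name and the example users are hypothetical.

```python
from email import message_from_string
from email.utils import getaddresses


def local_recipients(raw_message, local_users, domain):
    """Return the local users a message from a shared mailbox is addressed to.

    Only the To and Cc headers are consulted; an address counts as local
    when its host part is empty or equals domain.
    """
    msg = message_from_string(raw_message)
    headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    found = []
    for _name, addr in getaddresses(headers):
        user, _, host = addr.partition("@")
        if host.lower() in ("", domain) and user in local_users and user not in found:
            found.append(user)
    return found


raw = (
    "To: Joe <joe@ccil.org>, stranger@example.com\n"
    "Cc: esr@ccil.org, joe@ccil.org\n"
    "Subject: lunch\n"
    "\n"
    "Noon?\n"
)
print(local_recipients(raw, {"joe", "esr"}, "ccil.org"))  # ['joe', 'esr']
```

Even this toy has to cope with display names, duplicates, and foreign addresses, which hints at why the full-generality version took so long to get right.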
But multidrop addressing turned out to be an excellent design decision as well. Here is how I knew:
14. Any tool should be useful in the expected way, but a truly great tool lends itself to uses you never expected.
The unexpected use for multidrop fetchmail is to run mailing lists with the list kept, and alias expansion done, on the client side of the SLIP/PPP connection. This means that someone running a personal machine through an ISP account can manage a mailing list without continuing access to the ISP's alias files.
Another important change demanded by my beta testers was support for 8-bit MIME operation. This was easy to do, because I had been careful to keep the code 8-bit clean. Not because I anticipated the demand for this feature, but in obedience to another rule:
15. When writing gateway software of any kind, take pains to disturb the data stream as little as possible, and never throw away information unless the recipient forces you to!
Had I not obeyed this rule, 8-bit MIME support would have been difficult and buggy. As it was, all I had to do was read RFC 1652 and add a trivial bit of header-generation logic.
Some European users talked me into adding an option to limit the number of messages retrieved per session (so that they can control costs on their expensive phone networks). I resisted this for a long time, and I am still not entirely happy about it. But if you are writing for the world, you have to listen to your customers; this does not change just because they are not paying you in money.
VIII. A Few More Lessons From Fetchmail
Before we go back to general software-engineering issues, there are a couple more specific lessons from the fetchmail experience to ponder.
The rc file syntax includes optional "noise" keywords that are entirely ignored by the parser. The English-like syntax they allow is considerably more readable than the terse keyword/value pairs you get when you strip them all out.
I added them when I noticed how much the rc file declarations were beginning to resemble an imperative minilanguage. (This is also why I changed the original popclient "server" keyword to "poll".)
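The noise-keyword idea is easy to sketch. The word list, the keywords, and the function name below are illustrative assumptions and not a statement of fetchmail's actual grammar, which is richer than flat keyword/value pairs.

```python
# Words the scanner throws away; chosen here purely for illustration.
NOISE_WORDS = {"and", "with", "has", "wants", "options"}


def parse_entry(line):
    """Parse one rc-style entry into a dict of keyword/value pairs.

    Noise words are dropped before pairing, so the English-like form and
    the terse form of the same entry parse identically.
    """
    tokens = [tok for tok in line.split() if tok not in NOISE_WORDS]
    if len(tokens) % 2 != 0:
        raise ValueError(f"keyword without a value in: {line!r}")
    return dict(zip(tokens[0::2], tokens[1::2]))


readable = parse_entry("poll locke.ccil.org with protocol pop3 and user esr")
terse = parse_entry("poll locke.ccil.org protocol pop3 user esr")
print(readable == terse)  # True
```

Because the noise words are removed in the scanner, they add readability without adding a single rule to the grammar, which keeps the parsing cost of the English-like form close to zero.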
It seemed to me that making this minilanguage more like English might make it easier to use. Now, although I am a convinced partisan of the "make it a language" school of design as exemplified by Emacs, HTML, and many database engines, I am not normally a big fan of "English-like" syntaxes.
Traditionally, programmers have tended to favor control syntaxes that are very precise and compact and have no redundancy at all. This is a cultural legacy from the days when computing resources were expensive, so that parsing stages had to be as cheap and simple as possible. English, with about 50% redundancy, looked like a very inappropriate model then.
This is not my reason for normally avoiding English-like syntaxes; I mention it here only to demolish it. With cheap cycles and cheap memory, terseness should not be an end in itself. Nowadays it is more important for a language to be convenient for humans than to be cheap for the computer.
There are, however, good reasons to be wary. One is the complexity cost of the parsing stage: you do not want to raise it to the point where it becomes a significant source of bugs and user confusion in itself. Another is that trying to make a language's surface resemble English often leaves it as confusing as a traditional syntax would have been (you can see this in many 4GLs and commercial database query languages).