We’ve titled this paper Round 2 because it feels like we’ve been through a boxing match regarding our first article on the subject. Despite the wounds, we felt that the first article brought up a number of interesting issues and generated a few very useful discussions, causing us to rethink our initial conclusion.
First, we’d like to thank the people who commented and gave useful information, particularly those who pointed to relevant links and applicable documents. That was much more helpful than the personal attacks posted in response to our article.
Second, we want to state again that we did not accuse anyone of copying, theft, illegal activity, guilt, or infringement. We tried to walk the line of fairness. Some readers nonetheless accused us of making those accusations. Interestingly, some readers claimed we were unfairly accusing Oracle, some claimed we were unfairly accusing Sun, and others claimed we were unfairly accusing Google. Maybe that means we succeeded in walking that line.
We did say, however, that it appeared Oracle had missed a number of files that seemed to have been copied from Sun, but we distinctly noted, “Not knowing all of the details of the case, there could be issues that we’re not aware of.” We realize that this was a long case involving many people, much code, and a very large pretrial record containing a huge number of documents. As we explained at the beginning of the article, we considered this an interesting exercise and wanted to share our results.
Readers did point out some issues in our article that we would like to correct. First, we made some statements regarding copyright that are not completely accurate. A work can be jointly owned by two or more copyright holders who then have the right to individually assign nonexclusive rights without the permission of the other copyright holders. This is not typically done by companies developing code, because it effectively gives away the copyrights. It is more typically done when a company accepts code developed by an outside entity. In fact, as was pointed out by one reader, Sun has an agreement called the Sun Contributor Agreement (SCA) that specifies that any person who contributes code to a Sun-managed project gives Sun joint copyright in the code. This is an interesting way for Sun to ensure that code contributed to any of its projects can be used without restriction by Sun without copyright issues. However, note some points in the agreement that we will discuss later:
These terms apply to your contribution of materials to a product or project owned or managed by us …
The term ‘contribution’ means any source code, object code, patch, tool, sample, graphic, specification, manual, documentation, or any other material posted or submitted by you to a project.
you agree that each of us can do all things in relation to your contribution as if each of us were the sole owners, and if one of us makes a derivative work of your contribution, the one who makes the derivative work (or has it made) will be the sole owner of that derivative work.
Also, to be fully accurate, two identical (or nearly identical) files could be copies of each other or of some third-party file. Given that we could not find evidence of any third party at that time, we felt it was safe to conclude that one file was copied from the other.
Java package java.util.concurrent
As one reader pointed out, the Java package java.util.concurrent was not asserted by Oracle. Oracle expert John C. Mitchell, in his expert report dated July 29, 2011, states in a footnote on page 61:
Oracle has chosen not to assert copyright infringement of several other Java SE packages, in some cases because Oracle uses these packages under license from third parties or allows third parties to utilize these packages under permissive terms. These packages include: java.math, java.util.concurrent, java.util.concurrent.atomic, java.util.concurrent.locks, javax.xml, javax.xml.datatype, javax.xml.namespace, javax.xml.parsers, javax.xml.transform, javax.xml.transform.dom, javax.xml.transform.sax, javax.xml.transform.stream, javax.xml.validation, and javax.xml.xpath.
Our mistake was that we searched the reports for the names of the files that we found to be matching rather than for the package name. Because of this approach, we did not notice that the entire package was not asserted by Oracle against Google.
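The difference between the two search strategies can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure we actually used: the function names `package_of` and `report_mentions`, the sample file name `Hypothetical.java`, and the idea that a report is available as plain text are all hypothetical.

```python
from pathlib import PurePosixPath


def package_of(source_path: str) -> str:
    """Derive a dotted Java package name from a relative source path.

    Assumes the path starts at the source root, e.g.
    'java/util/concurrent/Hypothetical.java' -> 'java.util.concurrent'.
    """
    return ".".join(PurePosixPath(source_path).parent.parts)


def report_mentions(report_text: str, source_path: str) -> dict:
    """Check a report for both the file name and the enclosing package name."""
    file_name = PurePosixPath(source_path).name
    return {
        "file": file_name in report_text,
        "package": package_of(source_path) in report_text,
    }


# Excerpt of the footnote quoted above; the file name is made up for illustration.
footnote = "These packages include: java.math, java.util.concurrent, javax.xml"
result = report_mentions(footnote, "java/util/concurrent/Hypothetical.java")
```

Searching only for the file name would report no mention here, while searching for the package name finds the footnote.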
As we pointed out in the article, Doug Lea, professor of Computer Science at the State University of New York at Oswego, headed Java Specification Request group JSR-166 of the Java Community Process (JCP), which created the original java.util.concurrent package. It would be interesting to know why Oracle decided not to assert this package, though. According to the SCA, and according to copyright law, Doug Lea and the other members of his group would hold the copyright only for those modifications that they made to the files, not for anything created solely by Sun. Also, if Sun made any further changes to the files, then under copyright law Sun would hold full, exclusive copyright in the changes it made to those derivative files. And, if our reading of the SCA is accurate, Sun would actually hold full, exclusive copyright in the entire content of the derivative files according to the SCA (though copyright law may trump this agreement). However, the point of this paper is not to conjecture why things may have occurred.
The mime4j files
That still leaves the four files we found from the mime4j project used in Android. The Java versions of these files have the copyright notice:
* Copyright 2004 Sun Microsystems, Inc. All rights reserved.
* SUN PROPRIETARY/CONFIDENTIAL. Use is subject to license terms.
As we pointed out before, the Sun copyright notices in the file comments are for 2004 while the earliest release of software for the open source mime4j project is May 3, 2005. The dates in the comments of the Sun files go back to 2003. Now, having a copyright notice in these files, in and of itself, does not prove that Sun owned the copyright. Even registering a copyright with the Copyright Office is not proof of copyright ownership. And placing a date in a comment in a file does not prove that the file was created on that date.
One of the readers noted that the mime4j files had comments stating that they were created by the tool Java Compiler Compiler (JavaCC), a program that automatically generates Java code. Our research subsequently showed that JavaCC was initially developed at Sun but later given to the open source community. The reader noted that the misspelling of the word “followng” in a comment in the Android and Java files we had found also appeared in files used in the code for JavaCC. Another reader pointed us to the World Wide Web Consortium (W3C). After some searching, we found two of the files, ParseException.java and TokenMgrError.java, on its website in the code for an HTML parser developed by or given to the W3C. These files were dated October 19, 1999, several years before the Sun copyright notice dated 2004.
We have been unable to track down the remaining two Android files, JJTAddressListParserState.java and SimpleNode.java, which are nearly identical to the Java files JJTParserState.java and SimpleNode.java respectively. The comments in the files state that they were generated by JJTree, a preprocessor for JavaCC that automatically generates Java code. This leads us to now believe that these files were automatically generated, which would rule out copying. It still does not explain why Sun has a copyright notice in them.
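A check for these clues can be sketched as follows. This is a minimal illustration, not the tool we used: the function name `generator_hints` is hypothetical, the regular expression assumes the generator comment contains the words "Generated By" followed by the tool name, and the sample text is made up rather than taken from any of the files discussed.

```python
import re

# Assumed marker format: "Generated By" followed by the tool name, any letter case.
GENERATOR_RE = re.compile(r"generated\s+by\s*:?\s*(JavaCC|JJTree)", re.IGNORECASE)


def generator_hints(source_text: str) -> dict:
    """Report which code generators a file names and whether the typo is present."""
    tools = sorted({m.group(1).lower() for m in GENERATOR_RE.finditer(source_text)})
    return {
        "tools": tools,
        "has_followng_typo": "followng" in source_text,
    }


# Hypothetical sample, for illustration only.
sample = (
    "/* Generated By:JJTree: Do not edit this line. */\n"
    "// Was expecting one of the followng tokens\n"
)
hints = generator_hints(sample)
```

A file that names a generator and carries the shared misspelling points toward a common generated origin rather than file-to-file copying, though such hints are only circumstantial.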
Speculation and Theories
As we mentioned, some reader comments were very helpful in pointing us to relevant information or allowing us to refine our research. There were also a few interesting arguments raised in the comments on the previous article that we would like to address. First, innocent intent is not a defense to infringement. If a first party owns a copyright in a work and a second party unknowingly uses a right to which it is not entitled, such as by using the work under license from an infringer without knowledge of the infringement, that still constitutes infringement.
Percentages don’t matter. There is no aspect of copyright law that says it is necessarily fair use to copy a certain percentage of a work or less. Therefore, an argument like “only 9 files out of 12,262… proves… Google did not copy code” simply has no basis in the law. If any single file was copied, that would mean that 100% of that file was taken. To focus on the totality of the work misses the entire copyright and fair use analysis. As the U.S. Copyright Office explains on its website:
How much of someone else’s work can I use without getting permission?
Under the fair use doctrine of the U.S. copyright statute, it is permissible to use limited portions of a work including quotes, for purposes such as commentary, criticism, news reporting, and scholarly reports. There are no legal rules permitting the use of a specific number of words, a certain number of musical notes, or percentage of a work. Whether a particular use qualifies as fair use depends on all the circumstances. See FL 102, Fair Use, and Circular 21, Reproductions of Copyrighted Works by Educators and Librarians.
Furthermore, in 1985 the Supreme Court of the United States recognized in Harper & Row v. Nation Enterprises that the copying of only 300 words from an entire book constituted copyright infringement. In Harper & Row, the Supreme Court also determined that the fair use defense did not save the copyist. So whether naysayers choose to believe it or not, there is no right to literally copy even a small amount of material.
Theories about how something could have happened, without any evidence, have little weight. Theories based on the evidence available carry significant weight, as noted on the Ladas and Perry LLP website:
Since there is seldom direct evidence of copying (witnesses who actually saw the defendant copy the work, for instance), a copyright owner may prove copying through circumstantial evidence establishing that the defendant had access to the original work and that the two works are substantially similar.
Given the evidence we had at the time, we suggested the most reasonable conclusion. The legal system is adversarial for just this reason—one side presents evidence to support one theory and the other side attempts to present evidence that contradicts that theory. That is what happened here, and we appreciate those who did so in a cordial manner. We had no stake in this case. We felt the exercise was useful to us and we hoped it was educational to the readers.
We welcome further comments and criticism, though we do ask that commenters stick to the merits of the arguments. We appreciate facts about what actually occurred, references to documents, and links to websites, but not speculation about what might have occurred in some highly unlikely scenario. Fortunately, cases like these are won by presenting facts and drawing reasonable conclusions based on those facts. In that spirit, we welcome your feedback.
We would like to acknowledge attorney Steve Wu of Cooke Kobrick & Wu LLP for his discussions and for his help locating trial documents for this case. We would also like to thank attorney Gene Quinn of IPWatchdog for his review of the article and his suggestions, and expert witness Andy Finkel for his review and suggestions.