At the Core: Patent Examiner and Art Unit Data Explained

Editorial Note: Last week I published an article titled Patent Strategy: Discovering Crucial Patent Examiner Data.  There were numerous, substantive questions posed about the PatentCore software, so I invited co-creator and patent attorney Chris Holt to address those questions in the article that follows.


We are very pleased with the interest generated in the PatentCore information system.  The feedback has been overwhelmingly positive.  A number of people have had questions about some of the specifics of the system.  The purpose of this article is to address the most common questions posed by IPWatchdog readers in response to a recent article, Patent Strategy: Discovering Crucial Patent Examiner Data, which was published on January 4, 2012.

One of the most persistent questions asks us to address the inspiration for and reasoning behind the system. As patent attorneys ourselves, we have prosecuted patents for many years for a wide variety of clients.  As committed professionals, we are constantly trying to improve our own performance to ensure that we are delivering quality services to our clients.  That was, quite frankly, the inspiration for the system.  As we began to explore the concepts behind PatentCore, we realized that we could bring value not only to our clients but to the patent community as a whole.  The primary goal of PatentCore is to improve the patent process for patent examiners and patent attorneys and, as a result and most importantly, for our clients.

Early in my career, I encountered a series of approximately 20 patent applications that were assigned to a small number of different art units.  During the time it took to bring the cases to resolution, I kept detailed notes of my experiences prosecuting each case.  It eventually occurred to me that the information I’d collected might be useful to other prosecutors working with the same examiners and/or art units.  I wondered whether my colleagues, by reviewing my notes and gaining insight from my experiences, might be able to accomplish resolution more effectively and efficiently.  However, the subjective and anecdotal nature of my notes limited their practicality.  Recognizing that fact, I began to consider ways in which practitioners could more effectively share their prosecution experiences with one another. The result of years of study and consideration is PatentCore.

The fact of the matter is that each examiner and art unit operates somewhat differently.  When patent professionals are provided with insight into the trends or tendencies of those with whom they are working, they are relieved of the “trial and error” methodology traditionally used to guide prosecution.  PatentCore is designed to create better and more efficient communication between the applicant (or the applicant’s representative) and the examiner, which in many cases should lead to better patents obtained sooner.  Our purpose is to improve the efficiency and quality of the prosecution process by empowering patent professionals with useful frames of reference.

In no way does compiling and analyzing the data “cheat” the system.  In most litigation contexts, it is common for one party or another to thoroughly research the history and tendencies of a judge hearing the case.  Such research might involve reviewing prior opinions, interviewing other lawyers about a particular judge’s likes or dislikes, attempting to determine the judge’s views on Rule 12 motions, and perhaps even developing an understanding of the judge’s docket and/or backlog.  These analyses are simply part of good practice in law.

The PatentCore information system was developed to help patent attorneys develop better strategies and arguments for their clients, given the particular examiner assigned to a case.  A good patent practitioner would never consider the information in PatentCore as a sole source upon which to base decisions.  In fact, a seasoned patent practitioner recently advised us that the most important consideration is the art the Examiner finds, and that consideration of which Examiner is assigned to the case is the second most important factor.  The information from the PatentCore system should be considered and can inform decisions and strategy; from our point of view, using it is simply good lawyering.

Another commonly asked question is where we get the underlying data that our system analyzes.  We are more than happy to share information about the sources of our data.  Our statistics are based in large part upon machine readable datasets provided directly by the USPTO to the general public.  We say “in large part” because some of the data comes to us from Google, but Google acquires that data directly from the USPTO (with the USPTO’s blessing, of course).

The Google data that we use is obtained as the result of a gradual crawl of patent documents, including image file wrappers, from the USPTO’s public PAIR (Patent Application Information Retrieval) site.  According to Google, this crawl operates continually.  The crawl retrieves both already-submitted documents and new documents as the USPTO makes them publicly available.  As the USPTO makes the data available, Google acquires it, and then we acquire it, process it, and use it to feed the PatentCore data system.

Currently, the Google coverage includes about 1,422,031 patent applications, including (according to Google) most of the published applications in the following ranges:

08000001 – 08033366
08100000 – 08134568
08200000 – 08235165
09800000 – 09894396
09900000 – 09987194
10000001 – 10076633
10100000 – 10181211
10200000 – 10273817
10300001 – 10369900
10400000 – 10461394
10500000 – 10568730
10600001 – 10666883
10700000 – 10767679
10800001 – 10862312
10900000 – 10966609
11000001 – 11073148
11100000 – 11141140
11200000 – 11239876
11300000 – 11339059
11400000 – 11439539
11500000 – 11538547
11600000 – 11638977
11700000 – 11739382
11800000 – 11837665
11900000 – 11934601
12000001 – 12039464
12100001 – 12140725
12200000 – 12219134
12300000 – 12370513
12400000 – 12471703
12500000 – 12570144
12600000 – 12667374
29000001 – 29026192
60000003 – 60001535
90000001 – 90006696
95000001 – 95001409
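
For readers who want to check whether a particular application number falls inside one of the listed ranges, the lookup can be sketched as follows. This is only an illustration: the function name `in_listed_range` is hypothetical, the table below copies just a few of the ranges above, and, because Google reports covering "most" rather than all published applications in each range, a hit does not guarantee that the application is actually in the data.

```python
from bisect import bisect_right

# A small subset of the ranges listed above (inclusive, sorted by start).
# Extend this list with the remaining ranges for a complete lookup.
LISTED_RANGES = [
    (8000001, 8033366),
    (8100000, 8134568),
    (10000001, 10076633),
    (12600000, 12667374),
]

# Precompute the range starts so each lookup is a binary search.
_STARTS = [start for start, _ in LISTED_RANGES]


def in_listed_range(application_number: str) -> bool:
    """Return True if the number falls inside one of LISTED_RANGES.

    Application numbers are written as 8-digit strings (e.g. "08012345"),
    so the leading zero is dropped by converting to an integer.
    """
    number = int(application_number)
    # Find the last range whose start is <= number.
    index = bisect_right(_STARTS, number) - 1
    return index >= 0 and number <= LISTED_RANGES[index][1]


if __name__ == "__main__":
    for candidate in ("08012345", "08050000", "12667374"):
        print(candidate, in_listed_range(candidate))
```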

The statistics and analysis in the PatentCore system account for (as of the time I am writing this) approximately 1,112,733 of the 1,422,031 applications in the Google system.  We have fallen a bit behind Google due to some hardware upgrades and data transfers that we are in the process of making to accommodate growth.  We will most likely be caught up with Google again by the end of this month.  Then, we will keep pace with Google going forward.
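
To put those two figures in proportion, the arithmetic can be sketched as below. The two counts come from the paragraph above; the variable and function names are illustrative only.

```python
# Counts quoted in the article at the time of writing.
GOOGLE_TOTAL = 1_422_031         # applications in the Google data
PATENTCORE_ANALYZED = 1_112_733  # applications analyzed by PatentCore


def coverage_fraction(analyzed: int, total: int) -> float:
    """Fraction of the available applications that have been analyzed."""
    if total <= 0:
        raise ValueError("total must be positive")
    return analyzed / total


# Applications still waiting to be processed.
backlog = GOOGLE_TOTAL - PATENTCORE_ANALYZED
coverage = coverage_fraction(PATENTCORE_ANALYZED, GOOGLE_TOTAL)

if __name__ == "__main__":
    print(f"Backlog: {backlog:,} applications")
    print(f"Coverage: {coverage:.1%}")
```

In other words, a little over three quarters of the applications in the Google data had been analyzed when this was written.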

Thus, the PatentCore information system is most definitely dynamic. The system that one logs onto one day will be deeper and more accurate than the one logged onto the day before, as will be true for each succeeding day.  Further, we have been and continue to be dedicated to the task of adjusting algorithms as necessary to improve and protect the accuracy and integrity of the data. We are and will always be open to feedback from both inside and outside of the patent office. To facilitate such feedback, our contact information is provided on the PatentCore website.

Along the same line, some have been curious about the amount of data required for a given examiner to make the related statistics meaningful.  The first statistic on any examiner information screen in the PatentCore system identifies the total number of applications analyzed for that examiner.  A wise user will factor in the number of cases analyzed.  If the number seems low, then the information should be weighted accordingly, and it may be wise to return to the information at a later date once more data has been analyzed.

That said, even fairly limited data provides at least a brief profile of the Examiner and his/her decisions.  For instance, a recent PatentCore report for an Examiner showed only 45 applications had been analyzed at that point.  While that may seem a small sample, further analysis showed that, of the 45 cases, 23 were pending and 22 had been disposed of.  Of the latter 22 cases, 6 had been abandoned following a decision on appeal.  It was useful information to know that so many of the examiner’s decisions had been upheld on appeal.  The data seemed to have implications that might reasonably influence a decision regarding whether to suggest appeal or to file another RCE.
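
One way to judge how much weight a small sample deserves is to put an interval around the observed rate. The sketch below applies a Wilson score interval to the 6-of-22 figure from the example above. This is a standard statistical technique chosen here for illustration; it is not something the PatentCore system is described as computing, and the function name is hypothetical.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    z2 = z * z
    denominator = 1 + z2 / trials
    center = (p + z2 / (2 * trials)) / denominator
    half_width = (
        z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials * trials)) / denominator
    )
    return center - half_width, center + half_width


# Figures from the example: 22 disposed cases, 6 abandoned after an appeal decision.
low, high = wilson_interval(successes=6, trials=22)

if __name__ == "__main__":
    print(f"Observed rate: {6 / 22:.1%}")
    print(f"Approximate 95% interval: {low:.1%} to {high:.1%}")
```

The observed rate is about 27%, but with only 22 disposed cases the interval runs from roughly 13% to 48%, which is a concrete way of seeing why a low case count calls for caution.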

To summarize, while the system may not have extensive data for every Examiner, it is possible to extrapolate from the available information and to decide how much weight to give the data.  After all, the PatentCore system is only meant to inform decisions and to provide data to be considered along with other factors specific to a case and its unique circumstances.

Another area of inquiry, presumably from patent examiners, addressed inherited cases.  That is, what happens when an examiner inherits one or more cases from another examiner?  From our standpoint, there are three possible ways to handle this.  The first is to attribute the prosecution to the first examiner.  The second is to attribute it to the current examiner and the third is to throw the case out.  The PatentCore team decided to go with the second option.

One reason for that decision lies in the fact that the system reflects an outcome-centric model of prosecution lifecycles, meaning there are two primary outcomes for any given case: allowance or abandonment. Since the last examiner on the case ultimately controls the outcome, it seemed sensible to attribute the prosecution to that examiner.
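
The second option, attributing each case to the examiner currently assigned to it, can be sketched as follows. The `Case` record, its field names, and the sample data are all hypothetical; the article does not describe PatentCore's actual data model.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Case:
    """Hypothetical record for one application."""
    case_id: str
    examiners: list[str]  # examiners in chronological order of assignment
    outcome: str          # "allowed", "abandoned", or "pending"


def attribute_to_current_examiner(cases: list[Case]) -> dict[str, Counter]:
    """Tally outcomes per examiner, crediting each case to its last examiner."""
    tallies: dict[str, Counter] = {}
    for case in cases:
        if not case.examiners:
            continue  # no examiner recorded; nothing to attribute
        current = case.examiners[-1]
        tallies.setdefault(current, Counter())[case.outcome] += 1
    return tallies


# Made-up sample data: case "B" was inherited by Examiner Y from Examiner X.
sample = [
    Case("A", ["Examiner X"], "allowed"),
    Case("B", ["Examiner X", "Examiner Y"], "abandoned"),
    Case("C", ["Examiner Y"], "allowed"),
]
result = attribute_to_current_examiner(sample)

if __name__ == "__main__":
    for examiner, counts in result.items():
        print(examiner, dict(counts))
```

Under this rule the inherited case counts only toward Examiner Y, the examiner who controlled its outcome, and Examiner X's record reflects only the case that finished under X.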

As designers, we did conduct analysis designed to determine whether using this option would unduly skew the data.  We came to a number of conclusions.  First, our analysis of numerous examples indicated that the data trends remained stable.  The vast majority of Examiners did not appear to inherit enough cases (from Examiners who had behavior significantly different from their own) to change their statistics materially.  In addition, because the system is continually updating and gathering more data, the inherited cases will become less and less significant, even to those examiners with relatively large percentages of inherited cases.

Finally, there have been some questions about how the rejection-specific statistics (meaning ONLY the non-final office action table and the final office action table) are generated.  These statistics are based on an automated analysis of several million office action documents. In particular, optical character recognition is utilized to extract text from the documents and then classifiers are applied in an effort to determine which documents include which rejection types. These processes are not as precise as the algorithms utilized to generate all of the other statistics in the PatentCore system – in particular the statistics outside of the two tables. Due to the higher margin of error, a relatively high number of analyzed office actions is required to surface valid trends in the different statutory sections.
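
To give a rough feel for the second step, the sketch below tags OCR-extracted text with the statutory sections it appears to cite. It is a simple pattern-matching stand-in, not the classifiers PatentCore uses (the article does not describe them), and the function name and sample text are hypothetical. It also hints at why this kind of analysis is noisier than the other statistics: any OCR error in the citation causes the pattern to miss.

```python
import re

# Statutory sections commonly cited as grounds of rejection.
SECTIONS = ("101", "102", "103", "112")

# Matches citations such as "35 U.S.C. 103(a)", "35 USC § 102", "35 U.S.C. Section 112".
_CITATION = re.compile(
    r"35\s+U\.?\s?S\.?\s?C\.?\s*(?:§|[Ss]ection)?\s*(" + "|".join(SECTIONS) + r")\b"
)


def rejection_sections(ocr_text: str) -> set[str]:
    """Return the set of statutory sections cited in the extracted text."""
    return set(_CITATION.findall(ocr_text))


# Made-up sample of what OCR output from an office action might look like.
sample_text = (
    "Claims 1-5 are rejected under 35 U.S.C. 103(a) as being unpatentable. "
    "Claim 6 is rejected under 35 USC § 112, second paragraph."
)

if __name__ == "__main__":
    print(sorted(rejection_sections(sample_text)))
```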

The difference in relative accuracy between the rejection-specific statistics and all of the other statistics should be understood by all PatentCore users and probably needs to be more clearly communicated within the system itself. It would be a shame for the fuzzy nature of the rejection-based statistics to take anything away from the much more accurate algorithms and processes used to generate all of the other statistics in the PatentCore system, meaning every statistic outside of the two tables. With that in mind, the rejection-specific statistics (i.e., the two tables) have been removed for the time being.  If a subscriber would like the rejection-specific statistics turned back on immediately, that can easily be done on a user-by-user basis.  Just send us an email requesting that the rejection-specific statistics be turned on for your username/password.

The rejection-specific statistics will soon be added back in for everyone.  There will likely be a box placed around these components in order to make it clear that they are “different” than the other statistics.  Of course, a clear explanation of the relative accuracy of all of the PatentCore statistics will also be provided.

The bottom line is that we welcome the opportunity to discuss the PatentCore system with the patent community.  We want to stress our commitment to open dialog regarding all aspects of the system and we look forward to working with practitioners and with examiners, with the common goal of improving the operation of the patent system as a whole – a goal that will ultimately better serve all parties involved.

About the Author

Christopher Holt is a registered patent attorney with experience preparing and prosecuting U.S. and foreign patents in the electrical and mechanical fields, as well as for computer hardware and software inventions. Chris also specializes in preparing and prosecuting U.S. and foreign trademark applications, and enforcing trademark rights. He received his law degree from the University of Kansas School of Law and holds an undergraduate degree in physics from Southwestern College (Winfield, Kansas). Chris is licensed to practice law in both Minnesota and Kansas.


Join the Discussion

3 comments so far.

  • Anon
    January 14, 2012 08:09 pm

    First off, thank you for the trial run.

    Second, one suggestion still lingers in my mind about the tool set of individual examiner graphs.

    Bring them back.

    But bring them back to show each particular examiner’s work.

    It seems that you are trying too much on this – people don’t want some fancy algorithm, what they want is to know the exact history of the examiner. IF an examiner is listed as having 50 cases, then the graphs should show a breakdown of those fifty cases – nothing more, nothing less.

    In that sense, the percentages should come out to be either 100% of that examiner’s rejections (all accounted) or should come out to the percentage that matches his rejections less allowances on first action.

    Third, I am truly puzzled that anyone would make a comment that knowing about the examiner you are facing is somehow “cheating.” I can only think that someone saying this is unhappy that they are being measured at all. To the extent that these records are in the public eye, government employees do not have the luxury of NOT having their records there for anyone to aggregate. I would suggest (if permitted by the USPTO) a section in which the examiner can make comments on his own performance.

    Fourth – dealing with mixed records: give each examiner his or her own due. If a record is transferred, only count the new examiner’s actions against the new examiner. IF the data is truly crawled (and available in detail), I would prefer eliminating ANY unnecessary conflation. Such only takes away from the accuracy of knowing who you are dealing with (and that, to me, is the most value). I would also (data detail permitting) break down an individual examiner’s historical performance based on his or her signature authority, as well as aggregate the data by SPE.

  • Rick
    January 11, 2012 10:22 am

    Thanks very much for responding to the comments. Thanks also for taking steps to address the issue about the rejection statistics. I think you have a great basic idea here and look forward to seeing it reach its full potential. Cheers!

  • ceej@y
    January 10, 2012 09:14 pm

    Seems like a great idea.