We evaluated millions of patents, and forward citations were consistently the biggest predictor of high-value patents. In our last article we discussed why forward citations are relevant and why remaining patent term matters. Now we’d like to consider the remaining three factors we use to rank patents, and why they help eliminate less useful patents quickly and efficiently.
Independent Claim Count (Adjusted by Means Claims)
We hypothesized that paying for additional claims (three are included in the basic filing fee) would be highly correlated with value. Our analysis focused on looking at claim counts for four primary sets of patents: (i) a set of all issued patents from 2005-2014, (ii) a set of litigated patents from the same period, (iii) a set of patents from the brokered market that were sold from 2009-2014, and (iv) the representative patents from brokered patent packages.
As predicted, having more than three claims was highly correlated with the probability of the patent being litigated, sold, or listed as the representative patent for a sales package, i.e. the most important patent in the package.
We again modeled this ranking factor by comparing the prevalence of each claim count in the litigated patents (set ii) with its prevalence in the larger set of US issued patents (set i).
However, the number of independent claims alone is not a sufficient measure if, for example, all of the independent claims are written as means-plus-function claims (35 USC §112(f)). At least in the United States, given the present case law, such claims generally have less value for our clients.
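A minimal sketch of this kind of comparison, assuming the ranking factor is driven by the ratio of a claim count's share among litigated patents to its share among all issued patents. The function name `prevalence_ratio` and the toy inputs are hypothetical and are not our data.

```python
from collections import Counter


def prevalence_ratio(litigated_counts, baseline_counts):
    """For each claim count seen in the baseline set, return the ratio of
    its share in the litigated set to its share in the baseline set.
    A ratio above 1 means the count is over-represented among litigated patents."""
    lit = Counter(litigated_counts)
    base = Counter(baseline_counts)
    n_lit = len(litigated_counts)
    n_base = len(baseline_counts)
    ratios = {}
    for count, base_n in sorted(base.items()):
        lit_share = lit.get(count, 0) / n_lit
        base_share = base_n / n_base
        ratios[count] = lit_share / base_share
    return ratios


# Made-up independent claim counts, for illustration only.
baseline = [1, 2, 2, 3, 3, 3, 3, 4, 5, 6]
litigated = [3, 4, 4, 5, 6]
ratios = prevalence_ratio(litigated, baseline)
```

In this toy input, patents with four independent claims make up a larger share of the litigated set than of the baseline set, so their ratio is above 1.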
We analyzed the prevalence of means claims in our data sets (sets i-iv discussed above) and then developed an adjustment factor for the claim count rank based on the number of means claims. By analyzing the different data sets, we arrived at an adjustment factor under which a means claim generally has 1/10th the value of a non-means claim. We did, however, provide an exception: if there were at least 5 independent non-means claims, no adjustment was made to the claims rank.
We then back-tested this ranking on approximately 5,000 randomly selected patents with issue dates from 2005-2014 and looked at the distribution of the new ranking factor. Notably, this ranking factor will only lower the rank of ~12-13% of patents.
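The adjustment described above can be sketched as follows. This is an illustration under stated assumptions rather than our production formula: the function name is hypothetical, and "no adjustment" is read here as counting every independent claim at full value.

```python
MEANS_WEIGHT = 0.1        # a means claim counts as 1/10th of a non-means claim
NO_ADJUST_THRESHOLD = 5   # with 5+ non-means independent claims, skip the adjustment


def adjusted_independent_claims(non_means: int, means: int) -> float:
    """Return the effective independent claim count after the means adjustment."""
    if non_means >= NO_ADJUST_THRESHOLD:
        # Assumption: "no adjustment" means all claims count at full value.
        return float(non_means + means)
    return non_means + MEANS_WEIGHT * means


few_claims = adjusted_independent_claims(2, 3)    # 2 non-means, 3 means claims
many_claims = adjusted_independent_claims(5, 2)   # exception applies
```

A patent with two non-means and three means independent claims is treated as having roughly 2.3 independent claims, while one with five non-means claims keeps its full count.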
Claim 1 Word Count
Historically, our ranking heuristic viewed claim 1 word count as one of the more significant ranking factors and put a heavy emphasis on shorter claims. However, when we analyzed the multiple data sets (sets i-iv discussed above), there was no significant variation between any of the sets that are proxies for higher value (litigated, sold, representative patent) and the baseline set of all patents.
Instead, we now realize that claim 1 word count is better used as a filter that removes patents with extreme word counts from consideration. We used the data from litigated patents (set ii) as a guide in setting those limits.
Thus, the new ranking factor heavily down-ranks patents whose claim 1 has fewer than 25 words or more than about 250 words. We identified a range of 63-163 words as the sweet spot for claim 1 length in litigated patents. (Note: in a future version of the ranking system we might evaluate the shortest independent claim instead.)
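One way to express such a factor is a piecewise-linear score. The thresholds below come from the text, but the shape of the curve between them is an assumption made for illustration; the function name `claim1_word_score` is hypothetical.

```python
LOW_CUT, SWEET_LO, SWEET_HI, HIGH_CUT = 25, 63, 163, 250


def claim1_word_score(words: int) -> float:
    """Score claim 1 length on a 0..1 scale.

    Assumed shape: 0 outside 25-250 words, 1 inside the 63-163 sweet spot,
    and a straight-line ramp in between.
    """
    if words < LOW_CUT or words > HIGH_CUT:
        return 0.0
    if words < SWEET_LO:
        return (words - LOW_CUT) / (SWEET_LO - LOW_CUT)
    if words <= SWEET_HI:
        return 1.0
    return (HIGH_CUT - words) / (HIGH_CUT - SWEET_HI)


scores = {n: claim1_word_score(n) for n in (20, 44, 100, 300)}
```

Under these assumptions a 100-word claim 1 gets the full score, a 44-word claim gets half, and claims of 20 or 300 words get nothing.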
Family Size and International Filings
Does family size matter when looking for the better patents? Intuitively, family size and diversity of international filings should be good indicators of value. We hypothesized that, as with independent claim count, the investment needed to produce a larger patent family and file international patents would correspond to greater value. However, we found the impact was even less significant than the word count of claim 1: only a 10% contribution to the overall weighting.
Our new ranking system provides a maximum of 10 points for family size and international filing size:
- Up to 5 points for family size, scaled linearly from 0 to 12 INPADOC publications (a family with more than 12 publications is treated as 12)
- Multiply the family size rank by:
  - 2 if there is an issued EP, JP, CN patent
  - 1.5 if there is a published EP, JP, CN patent
  - 1.25 if there is a PCT publication and it is <2.75 years from priority
  - 1.25 if <1.75 years from priority (adjust for risk of no data)
  - 1 otherwise
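The scoring rules above can be sketched as a small function. The function and parameter names are hypothetical, the rules are assumed to apply in the order listed (first match wins), and the multipliers 1.5 and 1.25 are assumed values chosen so that the result never exceeds the stated 10-point maximum.

```python
def family_score(family_size, issued_foreign=False, published_foreign=False,
                 pct_published=False, years_from_priority=None):
    """Return the family/international score on a 0..10 scale."""
    # Up to 5 points, linear in family size, capped at 12 publications.
    base = 5.0 * min(max(family_size, 0), 12) / 12

    if issued_foreign:                 # issued EP, JP, or CN patent
        multiplier = 2.0
    elif published_foreign:            # published EP, JP, or CN patent
        multiplier = 1.5
    elif (pct_published and years_from_priority is not None
          and years_from_priority < 2.75):
        multiplier = 1.25              # PCT publication still early in its life
    elif years_from_priority is not None and years_from_priority < 1.75:
        multiplier = 1.25              # too early for foreign filing data to exist
    else:
        multiplier = 1.0
    return base * multiplier


large_issued = family_score(20, issued_foreign=True)   # capped at 12 publications
us_only = family_score(6)                               # no international filings
```

A family of 20 publications with an issued EP patent reaches the 10-point maximum, while a six-member family with no foreign filings scores 2.5.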
Let’s begin by making it clear that these metrics needed to be combined based on weighting factors to create a balanced total score. While doing this, there were two major considerations. A properly weighted system should create a large ranking spread between interesting and uninteresting patents, but it should also use a mix of the metrics in order to give a more rounded perspective.
We limited the weighting factor for each metric to between 10% and 60%. We then repeatedly ranked sets of random patents and known valuable sets with more than 400 different weighting factor possibilities. By comparing the possibilities that had the largest spread between the median patent ranks of each set, we were able to see trends. We averaged the top 10 weighting factor possibilities to get our baseline factors, and then adjusted these slightly after a manual review.
We then tested the system against smaller sets of patents that we had previously reviewed. The automated ranking system consistently ranked the focus patents of each set highly. This confirmed that the automated ranks would allow us to quickly identify the patents most likely to be useful and to eliminate a number of less interesting patents.
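The search described above can be sketched as a simple grid search. Everything named here is an assumption for illustration: the metric keys, the 10-point step, the use of the difference in median total scores as the "spread", and metric scores normalized to the 0..1 range.

```python
from itertools import product
from statistics import median

# Hypothetical keys for the five metrics; each patent is a dict of 0..1 scores.
METRICS = ("citations", "term", "claims", "words", "family")


def candidate_weights(step=10, lo=10, hi=60):
    """Yield weight tuples (in percent) that sum to 100, each within [lo, hi]."""
    values = range(lo, hi + 1, step)
    for combo in product(values, repeat=len(METRICS)):
        if sum(combo) == 100:
            yield combo


def total_score(patent, weights):
    """Weighted sum of a patent's metric scores."""
    return sum(w / 100 * patent[m] for m, w in zip(METRICS, weights))


def spread(weights, valuable, random_set):
    """Median score of the valuable set minus median score of the random set."""
    return (median(total_score(p, weights) for p in valuable)
            - median(total_score(p, weights) for p in random_set))


def baseline_weights(valuable, random_set, top_n=10, step=10):
    """Average the top_n weightings that separate the two sets the most."""
    ranked = sorted(candidate_weights(step),
                    key=lambda w: spread(w, valuable, random_set),
                    reverse=True)
    top = ranked[:top_n]
    return tuple(sum(w[i] for w in top) / len(top) for i in range(len(METRICS)))
```

With a 10-point step this grid contains 126 candidate weightings; a finer step gives more. Averaging the top candidates rather than taking the single best one keeps the result from leaning too hard on one metric.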
We set out to use the USPTO data on issued US patents (formerly hosted on Google Books but now directly hosted by the USPTO at https://data.uspto.gov/uspto.html) to refine our ranking system to provide a fully transparent, data-based ranking that can intuitively be explained to clients.
We successfully built a parser for the USPTO XML data set, using it to analyze the characteristics of US patents (issuing from 2005-2014) and compare different subsets of that data. This included leveraging our unique database of over $7B worth of brokered patents, allowing us to quickly highlight those of most interest to our buying clients.
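As a rough illustration of the kind of extraction involved, the sketch below pulls claim statistics from a single grant document using Python's standard library. It is not our parser. The element names `claim` and `claim-ref`, the inline sample document, and the "means for" text test are assumptions that should be checked against the DTD published with the USPTO data.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed sample document.
SAMPLE = """<us-patent-grant>
  <claims>
    <claim id="CLM-00001" num="00001"><claim-text>1. A widget comprising a frame and a wheel.</claim-text></claim>
    <claim id="CLM-00002" num="00002"><claim-text>2. The widget of <claim-ref idref="CLM-00001">claim 1</claim-ref>, wherein the wheel is round.</claim-text></claim>
    <claim id="CLM-00003" num="00003"><claim-text>3. An apparatus comprising means for rolling.</claim-text></claim>
  </claims>
</us-patent-grant>"""


def claim_stats(xml_text):
    """Count independent claims, means claims, and the words in claim 1."""
    root = ET.fromstring(xml_text)
    independent = means = claim1_words = 0
    for index, claim in enumerate(root.iter("claim")):
        words = "".join(claim.itertext()).split()
        # Drop a leading claim number such as "1." before counting words.
        if words and words[0].rstrip(".").isdigit():
            words = words[1:]
        if index == 0:
            claim1_words = len(words)
        # Assumption: a claim that references no other claim is independent.
        if claim.find(".//claim-ref") is None:
            independent += 1
            if "means for" in " ".join(words).lower():
                means += 1
    return {"independent": independent, "means": means,
            "claim1_words": claim1_words}


stats = claim_stats(SAMPLE)
```

For the sample above, claims 1 and 3 are counted as independent, claim 3 is flagged as a means claim, and claim 1 has eight words.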
The following table summarizes our ranking factors with Excel-like formulas: