CFL Software

Patent Reader

CFL Patent Reader includes several distinctive features in the area of patent searching, not least the complete absence of any need for keyword or Boolean searching. The key concept is very simple. Traditional searching is done with keywords, in combination or proximity; semantic searching is done by expanding to related concepts, which are largely identified because they occur in similar contexts. In contrast, Patent Reader literally reads the content of the search document and all the content of each patent or other document in the dataset. We default to searching the claims, as these are the primary source of problems or opportunities for FTO and Invalidity searching in particular, but any or all of the other text sections can be searched in the same way. This methodology is guaranteed to find the primary search terms, the context terms of the search document, or both. It can do this because of the power and speed of contemporary CPUs and the full exploitation of multi-threading capabilities. It is the system underlying our long-established plagiarism detection programs.

This means the user has no query to design, because the program builds its own collection of terminology from the text. By searching on everything, tangentially related material is automatically relegated to the lower end of any report, and unrelated material does not appear at all. It also means that the search is guaranteed to be comprehensive and the report complete within the dataset used for comparison.
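The idea of building the term collection from the text and ranking by shared terms can be illustrated with a minimal sketch. It is not Patent Reader's implementation: the tokenisation rule, the stopword list and the function names `extract_terms` and `rank_documents` are all assumptions made for illustration.

```python
import re

# Hypothetical stopword list; the real program's terminology rules are not described here.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "with", "is", "on", "by", "or"}

def extract_terms(text):
    """Return the set of lowercase word tokens in text, minus stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}

def rank_documents(query_text, documents):
    """Rank documents by how many query terms they share; omit documents sharing none."""
    query_terms = extract_terms(query_text)
    scored = []
    for doc_id, text in documents.items():
        shared = query_terms & extract_terms(text)
        if shared:  # unrelated material does not appear at all
            scored.append((doc_id, len(shared)))
    # Highest number of shared terms first; ties broken by document id.
    return sorted(scored, key=lambda item: (-item[1], item[0]))
```

In this sketch the whole search document is the query, so a document that shares only one or two incidental terms lands at the bottom of the list, and one that shares none is dropped.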

Since February 2018 we have been using the high-quality Lighthouse IP dataset for enhanced R&D, in particular in defaulting to displaying CPC codes and including the abstract and first claim, as can be seen in our latest output PDF file. Here we are searching in the years 2012-2017 in the EPO, WIPO and USPTO datasets.

The report illustrated is the result of starting with the ultrasonic surgical shears illustrative sample from the USPTO Examination Guide. The program uses only the words in the sample, with no user input other than the filename. The same can be done by entering a patent or application number, in which case by default the claims are found and read by the program and used for comparison. Because the report is an HTML file, it can be loaded automatically into Microsoft Excel to use the comprehensive sorting and searching facilities in that program, as well as being fully searchable in a web browser.

In the example PDF the ranking on the left shows the number of terms Patent Reader has found in common between the sample and each document shown. You can see that all the documents returned have some relevance to the initial query, and these are the only ones returned by the program from all the patents in the six years in the test set. You can also see that if this had been your idea, you would quickly have found out that Ethicon Endo-Surgery, Inc. had got there before you! This is predominantly a US activity, even though all the EP and WO patents have also been searched.

This link shows the PDF we took to the PIUG Annual Conference in Atlanta in 2017, before we renamed the program. At PIUG we were one of only two first-time attendees who were also speakers, presenting our companion program SpanMatch. The PDF shows some of the additional outputs available from Patent Reader.

Patent Reader is an API, so these screenshots are simply indicative of what can be done with the output. The programs take UTF-8 plain text as input, as found in XML files as well as regular literature, and can return HTML, XML, JSON or whatever format an end user requires. The output displayed on page 2 of the PDF is actually working directly from the XML source files where the matched patents are found.
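As an illustration of how one result set might be rendered in more than one output format, here is a small sketch using only the Python standard library. The record fields (`number`, `title`, `shared_terms`) and the functions `to_json` and `to_html_table` are hypothetical and are not taken from the Patent Reader API.

```python
import html
import json

def to_json(results):
    """Serialise result records to JSON, keeping non-ASCII text readable."""
    return json.dumps(results, ensure_ascii=False, indent=2)

def to_html_table(results):
    """Render result records as a plain HTML table that a browser or spreadsheet can open."""
    rows = "".join(
        "<tr><td>{}</td><td>{}</td><td>{}</td></tr>".format(
            r["shared_terms"],
            html.escape(r["number"]),
            html.escape(r["title"]),  # escape &, < and > in free text
        )
        for r in results
    )
    header = "<tr><th>Terms</th><th>Number</th><th>Title</th></tr>"
    return "<table>" + header + rows + "</table>"

# Placeholder record for illustration only; not a real patent.
example = [{"number": "XX-0000000", "title": "Shears & clamp arm", "shared_terms": 12}]
json_report = to_json(example)
html_report = to_html_table(example)
```

Keeping the results as plain records until the last step means the same ranking can be delivered in whichever format the consumer needs.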

Here are some of the other features:

  • Search input is normally a description of the invention or an outline application, although it is possible to use a patent or application number instead.
  • The program performs automatic identification of all terminology in the search document.
  • The search uses all of those terms; in a full patent this can be over 2000 terms.
  • Results are ranked by number of search terms found in the full patent.
  • Filters can be applied for classification codes or parts of patents.
  • The report gives primary CPC classification, assignee, inventor and title in addition to the patent number and date of publication, optionally with the abstract and first claim (or any other metadata required).
  • Comparison of each matched patent with the search document.
  • Parallel display of the matched sentences, with location in the patent identified (Abstract, Brief Summary, Claims, Description).
  • Automatic collection of the citations for each matched patent.
  • Ability to compare any of the citations with the search description, or the citing patent, with parallel display of the results as for patents.
  • Comparative vocabulary use for each comparison pair.
  • Unified operation across US, EPO and WIPO datasets.
  • Searching in original languages, including Chinese.
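
The parallel display of matched sentences listed above can be pictured with a short sketch. It assumes a naive sentence splitter, a tiny stopword list and a threshold of three shared terms; none of these choices, nor the function names, come from Patent Reader itself.

```python
import re

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"a", "an", "the", "to", "of", "at"}

def terms(text):
    """Lowercase word tokens of a sentence, minus stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}

def split_sentences(text):
    """Naive split after a full stop or semicolon followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.;])\s+", text) if s.strip()]

def matched_sentences(search_text, patent_sections, min_shared=3):
    """Pair search sentences with patent sentences that share at least min_shared terms.

    patent_sections maps a section name (e.g. "Abstract", "Claims") to its text,
    so every pair records where in the patent the match was found.
    """
    pairs = []
    for query_sentence in split_sentences(search_text):
        query_terms = terms(query_sentence)
        for section, text in patent_sections.items():
            for sentence in split_sentences(text):
                shared = query_terms & terms(sentence)
                if len(shared) >= min_shared:
                    pairs.append((section, query_sentence, sentence, sorted(shared)))
    return pairs
```

Each returned tuple holds the section name, the search sentence, the matching patent sentence and the shared terms, which is enough to lay the two sentences side by side in a report.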
 

Get in touch

Contact us for information about any of our products and services, or to discuss your requirements.