CFL Software

Patent Reader

CFL Patent Reader includes several distinctive features for patent searching, not least that it needs no keyword or Boolean queries at all. Traditional searching is done with keywords, in combination or in proximity; semantic searching expands the query to related concepts, which are largely identified because they occur in similar contexts. In contrast, the key concept behind Patent Reader is very simple. It reads the full content of the search document and compares the terms it finds with all the content of each patent or other document in the dataset. This is the system underlying our long-established plagiarism detection programs.
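As a rough illustration of this idea (not CFL's actual implementation, whose term-identification rules are not described here), the following Python sketch extracts every term from a search text and intersects it with the terms of a candidate document. The function names, the stop-word list and the sample texts are all assumptions made for this example.

```python
import re

# Hypothetical stop-word list, assumed for this sketch only.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for", "with", "or", "by"}

def extract_terms(text):
    """Return the set of lower-cased word terms in text, minus stop words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOP_WORDS}

def shared_terms(search_text, document_text):
    """Terms of the search document that also occur in the candidate document."""
    return extract_terms(search_text) & extract_terms(document_text)

# Invented sample texts; no query is written by hand.
search = "A rotor blade with a heated leading edge for de-icing"
claim = "1. A wind turbine blade comprising a heated leading edge."
matched = sorted(shared_terms(search, claim))
print(matched)
```

The point of the sketch is that the whole search text is the query: nothing is selected or combined by the user before the comparison runs.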

Reading = time saving

This means the user does no work designing a suitable query, because the program builds its own collection of terminology from the text. Because the search uses every term, tangentially related material is automatically relegated to the lower end of any report, and unrelated material does not appear at all. It also means that the search is guaranteed to be comprehensive and the report complete within the dataset used for comparison. In addition, the Comparison tab takes you directly to the relevant claims, with full markup of the terms matched by the program, so you do not need to read the full patent and find the relevant sections yourself.

Comprehensive search

We default to searching the claims, as these are the primary source of problems or opportunities for FTO and invalidity searching in particular, but any or all of the other text sections can be searched in the same way. This methodology is guaranteed to find the primary search terms, the context terms of the search document, or both. It can do this because of the power and speed of contemporary CPUs and the full exploitation of their multi-threading capabilities.

Ranked results

Another contrast with national search engines in particular is that the results are always ranked by the level of similarity to the starting query, and full information on the terms found is shown.

Since February 2018 we have been using the high-quality ST.36 Lighthouse IP dataset. The program runs from the claim sections collected for the full set, but in addition we can collect any of the metadata or full text for reporting purposes.


The example shows both aspects of the program.

The upper section shows that you can load a file or copy-and-paste any text into the search box, which you can then edit. Clicking the Find Terms button identifies the terms for you, and you can edit these too. You can also select one or more terms that must be present.

The central section shows a report, where the ranking on the left is the count of the matched terms shown on the right for each patent or application. 
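The ranking described here can be sketched in a few lines of Python. This is an assumed, simplified model rather than the program's own code: each document is represented as a set of terms, the score is the count of matched terms, and an optional set of required terms must all be present. The function name, the tie-breaking rule and the document identifiers are hypothetical.

```python
def rank_documents(search_terms, documents, required=frozenset()):
    """Rank documents by the number of search terms they contain.

    documents maps a document id to its set of terms. Documents with no
    matched terms, or missing any required term, are left out of the report.
    """
    rows = []
    for doc_id, doc_terms in documents.items():
        matched = search_terms & doc_terms
        if not matched or not set(required) <= matched:
            continue
        rows.append((len(matched), doc_id, sorted(matched)))
    # Highest count first; ties broken by id (an assumption of this sketch).
    rows.sort(key=lambda row: (-row[0], row[1]))
    return rows

# Invented data for illustration only.
search_terms = {"blade", "heated", "leading", "edge", "rotor"}
documents = {
    "DOC-A": {"blade", "heated", "edge", "turbine"},
    "DOC-B": {"blade", "rotor", "heated", "leading", "edge"},
    "DOC-C": {"pump", "valve"},
    "DOC-D": {"rotor", "shaft"},
}
report = rank_documents(search_terms, documents)
must_have = rank_documents(search_terms, documents, required={"heated"})
for count, doc_id, terms in report:
    print(count, doc_id, ", ".join(terms))
```

In this toy data DOC-C shares no terms with the search and so never appears, while DOC-D, with a single match, falls to the bottom of the report.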

The lower section is an illustration of the interactive exploration capabilities of the program, which shows you exactly where in the claims the matching has occurred.  Each time you select a patent this content is immediately updated and you can see the full claims simply by clicking a Show All box.  You can do the same with any of the citations held by the program.

All matches are linked to the full patent of the appropriate jurisdiction.

The report can be saved as an HTML file, which can be viewed in a browser or imported into a spreadsheet if required.

API available

Patent Reader is an API, so these screenshots are simply indicative of what can be done with the output. The programs use UTF-8 plain text as input, as found in XML files as well as regular literature, and can return HTML, XML, JSON or whatever format the end user requires.
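To give a feel for what JSON output of a ranked report might look like, here is a minimal Python sketch. The field names (rank, document, matched_count, matched_terms) and the function are hypothetical and are not taken from the Patent Reader API; only the standard json module is used.

```python
import json

def report_to_json(rows):
    """Serialise (count, doc_id, matched_terms) rows as a JSON array.

    The field names are assumptions made for this sketch.
    """
    payload = [
        {
            "rank": position,
            "document": doc_id,
            "matched_count": count,
            "matched_terms": terms,
        }
        for position, (count, doc_id, terms) in enumerate(rows, start=1)
    ]
    # ensure_ascii=False keeps non-ASCII terms readable in the UTF-8 output.
    return json.dumps(payload, ensure_ascii=False, indent=2)

# Invented rows, already sorted by matched count.
rows = [(2, "DOC-B", ["blade", "edge"]), (1, "DOC-A", ["blade"])]
text = report_to_json(rows)
print(text)
```

The same rows could equally be rendered as HTML or XML; JSON is shown only because it is the shortest to illustrate.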


  • Search input is normally a description of the invention or an outline application, although it is possible to use a patent or application claim instead.

  • The program performs automatic identification of all terminology in the search document.

  • Searching on all those terms; in a full patent this can be over 2000 terms.

  • Results are ranked by number of search terms found in the full patent.

  • Filters can be applied for classification codes or parts of patents.

  • The illustrated report gives primary CPC classification, assignee, and title in addition to the patent number and date of publication, but additional data, such as inventor, and optionally the abstract or first claim, can be included.

  • Immediate display of each matched patent with the search document terms.

  • Parallel display of the matched sentences, with location in the patent identified (Abstract, Brief Summary, Claims, Description).

  • Automatic collection of the citations for each matched patent.

  • Ability to compare any of the citations with the search description, or the citing patent, with parallel display of the results as for patents.

  • Comparative vocabulary use for each comparison pair.

  • Unified operation across US and EPO and WIPO datasets.

  • Searching in original languages, including Chinese, Japanese and Korean.
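The filtering step in the list above (classification codes or parts of patents) could be sketched as follows. This is an assumed data model, not the program's own: each record carries a primary CPC code and a dictionary of text sections, and the record layout, function name and sample records are hypothetical. The CPC symbols are used purely as illustrative values.

```python
def filter_records(records, cpc_prefixes=(), sections=("claims",)):
    """Keep records whose CPC code starts with one of the given prefixes,
    and join only the requested text sections for comparison.

    An empty cpc_prefixes means no classification filter is applied.
    """
    prefixes = tuple(cpc_prefixes)
    selected = []
    for rec in records:
        if prefixes and not rec["cpc"].startswith(prefixes):
            continue
        text = " ".join(rec["text"].get(name, "") for name in sections)
        selected.append({"id": rec["id"], "cpc": rec["cpc"], "text": text})
    return selected

# Invented records for illustration only.
records = [
    {"id": "DOC-A", "cpc": "F03D 1/06",
     "text": {"claims": "A heated rotor blade.", "abstract": "De-icing of blades."}},
    {"id": "DOC-B", "cpc": "H01L 21/02",
     "text": {"claims": "A wafer cleaning method.", "abstract": "Semiconductor processing."}},
]
wind_only = filter_records(records, cpc_prefixes=["F03D"])
both_parts = filter_records(records, sections=("abstract", "claims"))
print([rec["id"] for rec in wind_only])
```

Filtering before comparison keeps the term matching itself unchanged; it only narrows which documents, and which parts of them, are read.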


Get in touch

Contact us for information about any of our products and services, or to discuss your requirements.