Patent Data Client

clients.patent_data - Client for USPTO patent data API.

This module provides a client for interacting with the USPTO Patent Data API. It allows you to search for and retrieve patent application data.

class pyUSPTO.clients.patent_data.PatentDataClient(config=None, base_url=None)[source]

Bases: BaseUSPTOClient[PatentDataResponse]

Client for interacting with the USPTO Patent Data API.

ENDPOINTS = {
    'download_application_document': 'api/v1/download/applications/{application_number}/{document_id}',
    'get_application_adjustment': 'api/v1/patent/applications/{application_number}/adjustment',
    'get_application_assignment': 'api/v1/patent/applications/{application_number}/assignment',
    'get_application_associated_documents': 'api/v1/patent/applications/{application_number}/associated-documents',
    'get_application_attorney': 'api/v1/patent/applications/{application_number}/attorney',
    'get_application_by_number': 'api/v1/patent/applications/{application_number}',
    'get_application_continuity': 'api/v1/patent/applications/{application_number}/continuity',
    'get_application_documents': 'api/v1/patent/applications/{application_number}/documents',
    'get_application_foreign_priority': 'api/v1/patent/applications/{application_number}/foreign-priority',
    'get_application_metadata': 'api/v1/patent/applications/{application_number}/meta-data',
    'get_application_transactions': 'api/v1/patent/applications/{application_number}/transactions',
    'get_search_results': 'api/v1/patent/applications/search/download',
    'search_applications': 'api/v1/patent/applications/search',
    'status_codes': 'api/v1/patent/status-codes',
}

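
A minimal sketch of how path templates like those in ENDPOINTS might be expanded into request URLs. The build_url helper and the example.invalid host are illustrative assumptions and are not part of pyUSPTO; only the two template strings are copied from the mapping.

```python
# Two templates copied from ENDPOINTS; everything else here is hypothetical.
ENDPOINTS = {
    "get_application_by_number": "api/v1/patent/applications/{application_number}",
    "download_application_document": (
        "api/v1/download/applications/{application_number}/{document_id}"
    ),
}


def build_url(base_url: str, endpoint_key: str, **params: str) -> str:
    """Fill the placeholders of one template and join it to a base URL."""
    path = ENDPOINTS[endpoint_key].format(**params)
    # Strip a trailing slash so the joined URL has exactly one separator.
    return f"{base_url.rstrip('/')}/{path}"


url = build_url(
    "https://example.invalid/",  # placeholder host, not the real API
    "get_application_by_number",
    application_number="16123456",
)
print(url)
```
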
__init__(config=None, base_url=None)[source]

Initialize the PatentDataClient.

Parameters:
  • config (USPTOConfig | None) – USPTOConfig instance containing API key and settings. If not provided, creates config from environment variables (requires USPTO_API_KEY).

  • base_url (str | None) – Optional base URL override for the USPTO Patent Data API. If not provided, uses config.patent_data_base_url or default.
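
The fallback order described for base_url can be sketched as follows. SketchConfig, resolve_base_url, and DEFAULT_BASE_URL are hypothetical stand-ins written for illustration; the real USPTOConfig class and the library's actual default URL are not reproduced here.

```python
import os
from dataclasses import dataclass
from typing import Optional

# Placeholder default; the library's real default URL is not shown here.
DEFAULT_BASE_URL = "https://example.invalid"


@dataclass
class SketchConfig:
    """Hypothetical stand-in for USPTOConfig with only the fields used below."""

    api_key: str
    patent_data_base_url: Optional[str] = None

    @classmethod
    def from_env(cls) -> "SketchConfig":
        # USPTO_API_KEY is the variable named in the parameter description.
        api_key = os.environ.get("USPTO_API_KEY")
        if not api_key:
            raise ValueError("USPTO_API_KEY is not set")
        return cls(api_key=api_key)


def resolve_base_url(config: SketchConfig, base_url: Optional[str] = None) -> str:
    """Explicit argument first, then the config value, then the default."""
    return base_url or config.patent_data_base_url or DEFAULT_BASE_URL


config = SketchConfig(api_key="dummy-key")
print(resolve_base_url(config))
print(resolve_base_url(config, base_url="https://override.invalid"))
```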

download_archive(printed_metadata, destination=None, file_name=None, overwrite=False)[source]

Download Printed Metadata (XML data).

These are XML files of the patent as printed. Auto-extracts if the server sends a TAR/ZIP archive.

Note

See also download_publication() for a clearer method name with identical functionality.

Parameters:
  • printed_metadata (PrintedMetaData) – PrintedMetaData object containing the download URL and metadata

  • destination (str | None) – Optional directory path to save the file

  • file_name (str | None) – Optional filename. If not provided, uses Content-Disposition header

  • overwrite (bool) – Whether to overwrite existing files. Default False

Returns:

Path to the downloaded file (or to the extracted file if the download was an archive)

Return type:

str

download_document(document, format=DocumentMimeType.PDF, destination=None, file_name=None, overwrite=False)[source]

Download document in specified format.

Automatically extracts if USPTO sends TAR/ZIP.

Parameters:
  • document (Document) – Document with document_formats list

  • format (str | DocumentMimeType) – Which format (PDF, XML, MS_WORD). Can be string or DocumentMimeType enum. Defaults to PDF.

  • destination (str | None) – Directory to save to (default: current directory)

  • file_name (str | None) – Override filename (default: from Content-Disposition)

  • overwrite (bool) – Overwrite existing file

Return type:

str

Returns:

Path to the downloaded file (or to the extracted file if the download was an archive)

Raises:

FormatNotAvailableError – If format not available for this document. The exception includes requested_format, available_formats, and document attributes for programmatic error handling.

Example

>>> docs = client.get_application_documents("19312841", document_codes=["CTNF"])
>>> path = client.download_document(docs[0], format="XML")
>>> # Or using enum:
>>> path = client.download_document(docs[0], format=DocumentMimeType.XML)
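
The filename fallback described above (explicit file_name first, otherwise the Content-Disposition header) could look roughly like this. Both helper functions and the "download.bin" fallback are hypothetical; the sketch uses only the standard library's email.message.Message to parse the header and does not claim to mirror the client's internal parsing.

```python
from email.message import Message
from typing import Optional


def filename_from_content_disposition(header_value: str) -> Optional[str]:
    """Extract the filename parameter from a Content-Disposition value."""
    message = Message()
    message["Content-Disposition"] = header_value
    # get_filename() returns None when no filename parameter is present.
    return message.get_filename()


def choose_file_name(file_name: Optional[str], header_value: str, fallback: str) -> str:
    """Prefer the caller's name, then the header's name, then a fallback."""
    return file_name or filename_from_content_disposition(header_value) or fallback


header = 'attachment; filename="18915708_12307527.xml"'
print(choose_file_name(None, header, "download.bin"))
print(choose_file_name("custom.xml", header, "download.bin"))
```
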
download_publication(printed_metadata, destination=None, file_name=None, overwrite=False)[source]

Download a publication XML file (grant or pre-grant publication).

This method downloads publication XML files from PrintedMetaData objects, such as grant documents or pre-grant publications (pgpub). Auto-extracts if the server sends a TAR/ZIP archive.

Parameters:
  • printed_metadata (PrintedMetaData) – PrintedMetaData object containing the publication download URL and filename information. Typically obtained from get_application_associated_documents() or from PatentFileWrapper’s grant_document_meta_data or pg_publication_document_meta_data.

  • destination (str | None) – Optional directory path where the file should be saved. If not provided, saves to the current directory. The directory will be created if it doesn’t exist.

  • file_name (str | None) – Optional custom filename. If not provided, uses the xml_file_name from the metadata (e.g., “18915708_12307527.xml”).

  • overwrite (bool) – Whether to overwrite an existing file at the destination. Default is False, which raises FileExistsError if file exists.

Returns:

Absolute path to the downloaded publication file (or to the extracted file if the download was an archive).

Return type:

str

Raises:
  • ValueError – If printed_metadata has no file_location_uri (download URL).

  • FileExistsError – If the file already exists and overwrite=False.

Examples

Download grant XML to a specific directory (auto-filename):

>>> response = client.get_application_by_number("18/915,708")
>>> ifw = response
>>> grant_metadata = ifw.grant_document_meta_data
>>> path = client.download_publication(grant_metadata, destination="./downloads")
>>> print(path)
'./downloads/18915708_12307527.xml'

Download pgpub XML with custom filename:

>>> pgpub_metadata = ifw.pg_publication_document_meta_data
>>> path = client.download_publication(
...     pgpub_metadata,
...     file_name="my_publication.xml",
...     destination="./downloads"
... )
>>> print(path)
'./downloads/my_publication.xml'

Download to current directory:

>>> path = client.download_publication(grant_metadata)
>>> print(path)
'./18915708_12307527.xml'
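
One way the "auto-extract if the server sends a TAR/ZIP archive" step might work is to sniff the downloaded file and unpack it when it is an archive. The sketch below is an assumption about the general technique, not the client's code: archive_kind and extract_first_zip_member are hypothetical helpers, and only the ZIP branch is extracted to keep the example short.

```python
import tarfile
import tempfile
import zipfile
from pathlib import Path
from typing import Optional


def archive_kind(path: Path) -> Optional[str]:
    """Return "zip", "tar", or None based on the file's contents."""
    if zipfile.is_zipfile(path):
        return "zip"
    if tarfile.is_tarfile(path):
        return "tar"
    return None


def extract_first_zip_member(archive_path: Path, destination: Path) -> Path:
    """Extract the first member of a ZIP archive and return its path."""
    with zipfile.ZipFile(archive_path) as archive:
        member = archive.namelist()[0]
        return Path(archive.extract(member, destination))


with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    plain = tmp_path / "plain.xml"
    plain.write_text("<doc/>")
    zipped = tmp_path / "download.bin"
    with zipfile.ZipFile(zipped, "w") as archive:
        archive.writestr("18915708_12307527.xml", "<doc/>")

    plain_kind = archive_kind(plain)    # not an archive
    zipped_kind = archive_kind(zipped)  # detected by content, not extension
    extracted = extract_first_zip_member(zipped, tmp_path / "out")
    extracted_name = extracted.name
    extracted_text = extracted.read_text()
```
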
get_IFW(*, application_number=None, publication_number=None, patent_number=None, PCT_app_number=None, PCT_pub_number=None, destination=None, overwrite=False, as_zip=True)[source]

Retrieve IFW metadata and download all prosecution documents.

Combines get_IFW_metadata with a bulk download of all available prosecution history documents (PDF preferred, DOCX fallback). Documents with no downloadable format (e.g., NPL references) are silently skipped. A warning is issued only if a document has a download URL but the download itself fails.

Parameters:
  • application_number (str | None) – USPTO application number (e.g., “16123456”).

  • publication_number (str | None) – USPTO pre-grant publication number.

  • patent_number (str | None) – USPTO patent number.

  • PCT_app_number (str | None) – PCT application number.

  • PCT_pub_number (str | None) – PCT publication number.

  • destination (str | None) – Directory for output. Defaults to current directory.

  • overwrite (bool) – Whether to overwrite an existing output. Default False.

  • as_zip (bool) – If True (default), package all downloads into a ZIP archive at {destination}/{app_no}_ifw.zip. If False, download files directly into {destination}/{app_no}_ifw/.

Return type:

IFWResult | None

Returns:

IFWResult with the PatentFileWrapper, the output path, and a mapping of document_identifier to filename for each downloaded document. Returns None if no application was found.

Raises:

FileExistsError – If the output path already exists and overwrite=False.
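
The output naming and overwrite rule described for as_zip can be sketched with pathlib. ifw_output_path and check_output are hypothetical helpers that only restate the documented {destination}/{app_no}_ifw.zip and {destination}/{app_no}_ifw/ patterns; they are not taken from the library.

```python
import tempfile
from pathlib import Path


def ifw_output_path(destination: str, app_no: str, as_zip: bool = True) -> Path:
    """Build the ZIP file path or the directory path for one application."""
    name = f"{app_no}_ifw.zip" if as_zip else f"{app_no}_ifw"
    return Path(destination) / name


def check_output(path: Path, overwrite: bool = False) -> Path:
    """Refuse to reuse an existing output unless overwrite is requested."""
    if path.exists() and not overwrite:
        raise FileExistsError(f"{path} already exists; pass overwrite=True")
    return path


with tempfile.TemporaryDirectory() as tmp:
    target = ifw_output_path(tmp, "16123456")
    target.touch()  # simulate an earlier run's output
    try:
        check_output(target)
        raised = False
    except FileExistsError:
        raised = True
    allowed = check_output(target, overwrite=True) == target
```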

get_IFW_metadata(*, application_number=None, publication_number=None, patent_number=None, PCT_app_number=None, PCT_pub_number=None)[source]

Retrieve complete patent file wrapper data using common identifiers.

This utility fetches the PatentFileWrapper, which contains comprehensive IFW metadata, application details, and more. Provide only one identifier if possible. If several are given, they are tried in the order stated in the parameter descriptions below (application number, patent number, publication number, PCT application number, PCT publication number), and the first successful match is returned.

Parameters:
  • application_number (str | None) – USPTO application number (e.g., “16123456”). Checked first (direct lookup).

  • patent_number (str | None) – USPTO patent number (e.g., “11000000”). Checked second (uses search).

  • publication_number (str | None) – USPTO pre-grant publication number (e.g., “20230123456”). Checked third (uses search).

  • PCT_app_number (str | None) – PCT application number. Checked fourth (direct lookup, treated as USPTO app#).

  • PCT_pub_number (str | None) – PCT publication number (e.g., “2023012345”). Checked fifth (uses search).

Returns:

A PatentFileWrapper object with comprehensive data if found using one of the identifiers, otherwise None.

Return type:

PatentFileWrapper | None
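
The "first successful match" behavior can be illustrated with stub resolvers. Everything in this sketch is hypothetical (LOOKUP_ORDER, first_match, make_resolver, and the string standing in for a PatentFileWrapper); the order follows the "Checked first ... fifth" notes in the parameter descriptions.

```python
from typing import Callable, Dict, List, Optional

# Priority order taken from the parameter descriptions above.
LOOKUP_ORDER = (
    "application_number",
    "patent_number",
    "publication_number",
    "PCT_app_number",
    "PCT_pub_number",
)

Resolver = Callable[[str], Optional[str]]


def first_match(identifiers: Dict[str, str], resolvers: Dict[str, Resolver]) -> Optional[str]:
    """Try each supplied identifier in priority order; return the first hit."""
    for name in LOOKUP_ORDER:
        value = identifiers.get(name)
        if value is None:
            continue  # identifier not supplied by the caller
        result = resolvers[name](value)
        if result is not None:
            return result
    return None


calls: List[str] = []
KNOWN = {"patent_number": {"11000000": "wrapper-for-11000000"}}


def make_resolver(name: str) -> Resolver:
    """Stub lookup that records it was called and consults a fixed table."""
    def resolve(value: str) -> Optional[str]:
        calls.append(name)
        return KNOWN.get(name, {}).get(value)
    return resolve


resolvers = {name: make_resolver(name) for name in LOOKUP_ORDER}
found = first_match(
    {"publication_number": "20230123456", "patent_number": "11000000",
     "application_number": "00000000"},
    resolvers,
)
print(found, calls)
```

In this run the application-number lookup misses, the patent-number lookup hits, and the publication-number resolver is never called.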

get_application_adjustment(application_number)[source]

Retrieve patent term adjustment (PTA) data for a specific application.

This method fetches the PatentTermAdjustmentData component from the full patent file wrapper. This data includes details on various delay quantities (e.g., A, B, C delays, applicant delays), the total calculated adjustment, and a history of PTA events that influenced the term.

Parameters:

application_number (str) – The USPTO application number for which PTA data is being requested (e.g., “16123456”).

Returns:

A PatentTermAdjustmentData object containing the PTA details if the application is found and has such data. Returns None if the application cannot be found or if PTA data is not available in the response.

Return type:

PatentTermAdjustmentData | None

get_application_assignment(application_number)[source]

Retrieve a list of patent assignments for a specific application.

This method fetches the assignment_bag from the patent file wrapper, which contains a list of Assignment objects. Each Assignment object details an assignment including information such as reel and frame numbers, recording dates, conveyance text, and details about the assignors and assignees.

Parameters:

application_number (str) – The USPTO application number for which assignment data is being requested (e.g., “16123456”).

Returns:

A list of Assignment objects, each representing a recorded assignment for the application. Returns None if the application cannot be found, or if no assignment data is available in the response. An empty list may be returned if the application is found but has no recorded assignments.

Return type:

list[Assignment] | None

get_application_associated_documents(application_number)[source]

Retrieve metadata for Pre-Grant Publication and Grant documents.

This method fetches metadata specifically for published documents associated with the patent application, such as Pre-Grant Publications (PGPUBs) and granted patent documents. It does not retrieve the prosecution history documents (see get_application_documents for that). The result is a PrintedPublication object, which holds PrintedMetaData including file URIs and names. Download with download_archive.

Parameters:

application_number (str) – The USPTO application number for which associated PGPUB/Grant document metadata is being requested (e.g., “16123456”).

Returns:

A PrintedPublication object containing PrintedMetaData for the Pre-Grant Publication and/or the Grant document, if available. Returns None if the application cannot be found or if no such associated document metadata is available. The fields within the returned object (pgpub_document_meta_data, grant_document_meta_data) may themselves be None if a particular type of document (e.g., PGPUB) does not exist for the application.

Return type:

PrintedPublication | None

get_application_attorney(application_number)[source]

Retrieve data for the attorney(s) of record for a specific application.

This method fetches the RecordAttorney object associated with the patent application. This object contains details about the attorney(s) of record, including customer number correspondence data, power of attorney information, and a list of listed attorneys.

Parameters:

application_number (str) – The USPTO application number for which attorney data is being requested (e.g., “16123456”).

Returns:

A RecordAttorney object with details about the attorney(s) of record if the application is found and such data exists. Returns None if the application cannot be found or if no attorney data is available in the response.

Return type:

RecordAttorney | None

get_application_by_number(application_number)[source]

Retrieve the full details for a specific patent application by its number.

This method fetches comprehensive information for a single patent application identified by its unique application number.

Parameters:

application_number (str) – The USPTO application number for the patent application (e.g., “16123456” or “18/915,708”). The application number will be automatically sanitized to remove commas and spaces.

Returns:

A PatentFileWrapper object representing the complete file wrapper for the application if found. This object contains all data sections related to the application, such as metadata, addresses, assignments, attorney/agent data, continuity data, PTA/PTE data, transactions, and associated documents. Returns None if the application cannot be found or if the response does not contain the expected data.

Return type:

PatentFileWrapper | None

get_application_continuity(application_number)[source]

Retrieve continuity data (parent/child applications) for a specific application.

This method fetches the lineage of the specified application, returning an ApplicationContinuityData object. This object consolidates lists of ParentContinuity (applications to which the current one claims priority) and ChildContinuity (applications claiming priority to the current one) objects, each detailing the related application’s key identifiers and status.

Parameters:

application_number (str) – The USPTO application number for which continuity data is being requested (e.g., “16123456”).

Returns:

An ApplicationContinuityData object containing lists of parent and child continuity relationships. Returns None if the application cannot be found or if the underlying data to construct continuity is not available. The lists within the returned object may be empty if no parent or child continuity links exist.

Return type:

ApplicationContinuityData | None

get_application_documents(application_number, document_codes=None, official_date_from=None, official_date_to=None)[source]

Retrieve metadata for documents associated with a specific application.

This method fetches a collection of document metadata related to the given patent application. The result is a DocumentBag object, which is an iterable collection of Document instances. Each Document object contains metadata such as its identifier, official date, document code and description, direction (incoming/outgoing), and available download formats.

Parameters:
  • application_number (str) – The USPTO application number for which document metadata is being requested (e.g., “16123456”).

  • document_codes (list[str] | None) – Filter by specific document type codes. If provided, only documents with these codes will be returned. Examples: [‘ABST’, ‘CLM’, ‘SPEC’, ‘DRWD’].

  • official_date_from (str | None) – Filter documents from this date (inclusive). Date format: YYYY-MM-DD (e.g., “2020-01-15”).

  • official_date_to (str | None) – Filter documents to this date (inclusive). Date format: YYYY-MM-DD (e.g., “2023-12-31”).

Returns:

A DocumentBag object containing metadata for all publicly available documents associated with the application that match the provided filters. The bag will be empty if no documents are found or if the API response indicates no documents. It does not return None for “not found” cases; an empty collection is returned instead.

Return type:

DocumentBag
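
The filter semantics (code membership plus an inclusive date range in YYYY-MM-DD form) can be illustrated on local data. SketchDocument and filter_documents are hypothetical; the real client may well pass these filters to the API instead of filtering locally, so this only shows what "inclusive" means for the two date bounds.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass
class SketchDocument:
    """Hypothetical stand-in with just the two fields the filters inspect."""

    document_code: str
    official_date: date


def filter_documents(
    documents: Iterable[SketchDocument],
    document_codes: Optional[List[str]] = None,
    official_date_from: Optional[str] = None,
    official_date_to: Optional[str] = None,
) -> List[SketchDocument]:
    """Keep documents matching the codes and the inclusive date range."""
    start = date.fromisoformat(official_date_from) if official_date_from else None
    end = date.fromisoformat(official_date_to) if official_date_to else None
    selected = []
    for doc in documents:
        if document_codes is not None and doc.document_code not in document_codes:
            continue
        if start is not None and doc.official_date < start:
            continue
        if end is not None and doc.official_date > end:
            continue
        selected.append(doc)
    return selected


docs = [
    SketchDocument("ABST", date(2020, 1, 15)),
    SketchDocument("CLM", date(2021, 6, 1)),
    SketchDocument("SPEC", date(2023, 12, 31)),
    SketchDocument("CTNF", date(2024, 2, 2)),
]
in_range = filter_documents(docs, official_date_from="2020-01-15", official_date_to="2023-12-31")
claims_only = filter_documents(docs, document_codes=["CLM"])
print([d.document_code for d in in_range], [d.document_code for d in claims_only])
```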

get_application_foreign_priority(application_number)[source]

Retrieve a list of foreign priority claims for a specific application.

This method fetches the foreign_priority_bag from the patent file wrapper. This bag contains a list of ForeignPriority objects, each representing a claim to a foreign patent application’s priority date. Details include the IP office name, filing date, and application number of the foreign priority application.

Parameters:

application_number (str) – The USPTO application number for which foreign priority data is being requested (e.g., “16123456”).

Returns:

A list of ForeignPriority objects, each detailing a claimed foreign priority. Returns None if the application cannot be found or if no foreign priority data is available. An empty list may be returned if the application is found but has no foreign priority claims.

Return type:

list[ForeignPriority] | None

get_application_metadata(application_number)[source]

Retrieve key metadata for a specific patent application.

This method fetches the ApplicationMetaData component from the full patent file wrapper. The metadata includes a wide range of information such as application status, important dates (filing, grant, publication), applicant and inventor details, classification data, and other core identifying information for the application.

Parameters:

application_number (str) – The USPTO application number for which metadata is being requested (e.g., “16123456” or “18/915,708”). The application number will be automatically sanitized.

Returns:

An ApplicationMetaData object containing the core details of the patent application if found. Returns None if the application cannot be found or if metadata is not available in the response.

Return type:

ApplicationMetaData | None

get_application_transactions(application_number)[source]

Retrieve the transaction history (events) for a specific application.

This method fetches the event_data_bag from the patent file wrapper. This bag contains a list of EventData objects, each representing a single recorded event in the prosecution history of the patent application. Events include details like an event code, a textual description, and the date the event was recorded.

Parameters:

application_number (str) – The USPTO application number for which transaction history is being requested (e.g., “16123456”).

Returns:

A list of EventData objects, each detailing a transaction or event in the application’s history. Returns None if the application cannot be found or if no transaction data is available. An empty list may be returned if the application is found but has no recorded transaction events.

Return type:

list[EventData] | None

get_patent(patent_number)[source]

Retrieve application metadata by patent number.

Searches the USPTO API for the given patent number and returns the corresponding PatentFileWrapper. This is a lightweight lookup that does not fetch the full document bag.

Parameters:

patent_number (str) – The USPTO patent number (e.g., “11000000”).

Returns:

The matching patent file wrapper, or None if not found.

Return type:

PatentFileWrapper | None

get_pct(pct_number)[source]

Retrieve application metadata by PCT number.

Accepts both PCT application numbers and PCT publication numbers. The format is auto-detected:

  • PCT application numbers (starting with “PCT”) are resolved via direct lookup using get_application_by_number.

  • PCT publication numbers (e.g., “WO2024012345A1”) are resolved via search.

Parameters:

pct_number (str) – A PCT application number (e.g., “PCT/US2024/012345”) or PCT publication number (e.g., “WO2024012345A1”).

Returns:

The matching patent file wrapper, or None if not found.

Return type:

PatentFileWrapper | None
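
The auto-detection rule stated above (numbers starting with “PCT” are application numbers, anything else is treated as a publication number) reduces to a prefix check. classify_pct_number is a hypothetical helper, and stripping surrounding whitespace is an added assumption.

```python
def classify_pct_number(pct_number: str) -> str:
    """Return "application" for PCT-prefixed input, else "publication"."""
    cleaned = pct_number.strip()  # whitespace handling is an assumption
    return "application" if cleaned.startswith("PCT") else "publication"


print(classify_pct_number("PCT/US2024/012345"))
print(classify_pct_number("WO2024012345A1"))
```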

get_publication(publication_number)[source]

Retrieve application metadata by publication number.

Searches the USPTO API for the given pre-grant publication number and returns the corresponding PatentFileWrapper. This is a lightweight lookup that does not fetch the full document bag.

Parameters:

publication_number (str) – The USPTO publication number (e.g., “20230123456”).

Returns:

The matching patent file wrapper, or None if not found.

Return type:

PatentFileWrapper | None

get_search_results(query=None, sort=None, offset=0, limit=25, fields_param=None, filters_param=None, range_filters_param=None, post_body=None, application_number_q=None, patent_number_q=None, inventor_name_q=None, applicant_name_q=None, assignee_name_q=None, filing_date_from_q=None, filing_date_to_q=None, grant_date_from_q=None, grant_date_to_q=None, classification_q=None, additional_query_params=None)[source]

Fetch a dataset of patent applications based on search criteria, always requesting JSON format.

For GET requests, the parameters align with the OpenAPI specification for /api/v1/patent/applications/search/download. For POST requests, post_body should conform to the PatentDownloadRequest schema.

Return type:

list[ApplicationMetaData]

get_status_codes(params=None)[source]

Retrieve USPTO patent application status codes and their descriptions.

This method fetches a list of defined USPTO patent application status codes (e.g., codes for “Pending,” “Abandoned,” “Issued”) using a GET request. The request can be customized with query parameters to filter or paginate the results if supported by the API endpoint.

Parameters:

params (dict[str, Any] | None) – A dictionary of query parameters to be sent with the GET request. These parameters can be used to filter or control the output of the status codes list. Defaults to None, which typically retrieves all available status codes or the API’s default set.

Returns:

An object containing a count of matching status codes, a StatusCodeCollection of the StatusCode objects (code and description), and a request identifier.

Return type:

StatusCodeSearchResponse

paginate_applications(post_body=None, **kwargs)[source]

Provide an iterator to easily paginate through patent application search results.

This method simplifies the process of fetching all patent applications that match a given search query by automatically handling pagination. Supports both GET and POST requests.

For GET requests, provide search parameters as keyword arguments. For POST requests, provide the search criteria in post_body.

The offset and limit parameters are managed by the pagination logic; setting them directly in kwargs or post_body might lead to unexpected behavior.

Parameters:
  • post_body (dict[str, Any] | None) – Optional POST body for complex search queries. If provided, performs POST-based pagination.

  • **kwargs (Any) – Keyword arguments for GET-based pagination or additional query parameters for POST requests.

Returns:

An iterator that yields PatentFileWrapper objects, allowing iteration over all matching patent applications across multiple pages of results.

Return type:

Iterator[PatentFileWrapper]

Examples

# GET-based pagination
for wrapper in client.paginate_applications(
    query="applicationNumberText:16*", limit=50
):
    print(wrapper.application_number_text)

# POST-based pagination
for wrapper in client.paginate_applications(
    post_body={
        "q": "applicationNumberText:16*",
        "facets": "true",
        "fields": "applicationNumberText,applicationMetaData"
    }
):
    print(wrapper.application_number_text)
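
For readers curious how offset/limit pagination of this kind generally works, here is a self-contained sketch. The paginate generator and the fake_fetch stub are hypothetical and stand in for the HTTP calls; the stopping rule (an empty or short page ends iteration) is an assumption about a typical implementation, not a statement about pyUSPTO internals.

```python
from typing import Callable, Iterator, List


def paginate(fetch_page: Callable[[int, int], List[str]], limit: int = 25) -> Iterator[str]:
    """Yield every record by advancing the offset one page at a time."""
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        if not page:
            return  # no more results
        yield from page
        if len(page) < limit:
            return  # a short page means this was the last one
        offset += limit


RECORDS = [f"app-{i}" for i in range(7)]
requested_offsets: List[int] = []


def fake_fetch(offset: int, limit: int) -> List[str]:
    """Stub standing in for one search request."""
    requested_offsets.append(offset)
    return RECORDS[offset:offset + limit]


results = list(paginate(fake_fetch, limit=3))
print(results, requested_offsets)
```

Because the generator owns the offset, a caller who also sets offset or limit in the request body would be fighting the loop, which is consistent with the caution given above.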

sanitize_application_number(input_number)[source]

Sanitize and validate a USPTO application number.

Application numbers take one of the following forms:

  • 8 digits (e.g., “16123456”)

  • Series code format: 2 digits + “/” + 6 digits (e.g., “08/123456”)

  • PCT format: “PCT/US2024/012345” → “PCTUS2412345”

This method removes common separators (commas, spaces) while preserving the “/” in series code format.

Parameters:

input_number (str) – Raw application number input. May include commas, spaces, or other formatting.

Returns:

Sanitized application number (either “NNNNNNNN” or “NN/NNNNNN”).

Return type:

str

Raises:

ValueError – If the format is invalid.
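
A rough approximation of the two non-PCT rules, for illustration only. sanitize_sketch is a hypothetical function, not the library's implementation; it strips commas and spaces, keeps the “/”, and deliberately leaves out the PCT conversion because its exact rule is not spelled out here.

```python
import re

_EIGHT_DIGITS = re.compile(r"\d{8}")
_SERIES_CODE = re.compile(r"\d{2}/\d{6}")


def sanitize_sketch(input_number: str) -> str:
    """Remove commas and spaces, then validate the remaining shape."""
    cleaned = input_number.replace(",", "").replace(" ", "")
    if _EIGHT_DIGITS.fullmatch(cleaned) or _SERIES_CODE.fullmatch(cleaned):
        return cleaned
    raise ValueError(f"Invalid application number: {input_number!r}")


print(sanitize_sketch("16,123,456"))
print(sanitize_sketch("08/123,456"))
```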

Examples

>>> client.sanitize_application_number("16123456")
'16123456'
>>> client.sanitize_application_number("16,123,456")
'16123456'
>>> client.sanitize_application_number("08/123456")
'08/123456'
>>> client.sanitize_application_number("08/123,456")
'08/123456'

search_applications(query=None, sort=None, offset=0, limit=25, facets=None, fields=None, filters=None, range_filters=None, post_body=None, application_number_q=None, patent_number_q=None, inventor_name_q=None, applicant_name_q=None, assignee_name_q=None, filing_date_from_q=None, filing_date_to_q=None, grant_date_from_q=None, grant_date_to_q=None, classification_q=None, earliestPublicationNumber_q=None, pctPublicationNumber_q=None, additional_query_params=None)[source]

Search for patent applications.

Can perform a GET request based on OpenAPI query parameters or a POST request if post_body is specified.

Return type:

PatentDataResponse

search_status_codes(search_request)[source]

Search USPTO patent application status codes using POST criteria.

Performs targeted searches for USPTO patent application status codes (e.g., for “Pending,” “Abandoned,” “Issued”) by sending a POST request with a JSON body containing the search_request criteria. This method is suited for more complex queries than the GET-based get_status_codes.

Parameters:

search_request (dict[str, Any]) – A dictionary with search criteria, sent as the JSON POST body. The structure must conform to USPTO API requirements for this endpoint (e.g., for searching by code or description keywords).

Returns:

An object containing a count of matching status codes, a StatusCodeCollection of the StatusCode objects (code and description), and a request identifier.

Return type:

StatusCodeSearchResponse

stream_document(document, format=DocumentMimeType.PDF)[source]

Stream a document in the specified format without saving to disk.

Returns a streaming requests.Response. The caller is responsible for consuming and closing it — use as a context manager or call response.close() when done.

Parameters:
  • document (Document) – Document with document_formats list.

  • format (str | DocumentMimeType) – Which format (PDF, XML, MS_WORD). Can be string or DocumentMimeType enum. Defaults to PDF.

Return type:

Response

Returns:

Streaming requests.Response object.

Example

>>> docs = client.get_application_documents("19312841", document_codes=["CTNF"])
>>> with client.stream_document(docs[0]) as response:
...     for chunk in response.iter_content(chunk_size=8192):
...         process(chunk)
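
As one possible use of streaming, the chunks can be fed to a hash so the document is never held in memory or written to disk. sha256_of_stream and FakeResponse are hypothetical; FakeResponse only imitates the iter_content(chunk_size=...) call and context-manager behavior of requests.Response so the sketch runs without network access.

```python
import hashlib
from typing import Iterator


def sha256_of_stream(response, chunk_size: int = 8192) -> str:
    """Hash a streamed body chunk by chunk and return the hex digest."""
    digest = hashlib.sha256()
    for chunk in response.iter_content(chunk_size=chunk_size):
        digest.update(chunk)
    return digest.hexdigest()


class FakeResponse:
    """Hypothetical stand-in exposing the same call shape as a stream."""

    def __init__(self, payload: bytes) -> None:
        self._payload = payload
        self.closed = False

    def iter_content(self, chunk_size: int = 1) -> Iterator[bytes]:
        for start in range(0, len(self._payload), chunk_size):
            yield self._payload[start:start + chunk_size]

    def close(self) -> None:
        self.closed = True

    def __enter__(self) -> "FakeResponse":
        return self

    def __exit__(self, *exc_info) -> None:
        self.close()  # leaving the with-block releases the connection


payload = b"%PDF-1.7 example bytes"
with FakeResponse(payload) as response:
    streamed_digest = sha256_of_stream(response, chunk_size=4)
print(streamed_digest)
```

With the real client, the object returned by stream_document could presumably be passed to sha256_of_stream in place of FakeResponse.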