Overview
ConfigPatches builds a patch payload for the sf-config-builder Java library. Use it to modify .seospiderconfig files without manually editing XML.
ConfigPatches
Mutable dataclass. All mutating methods return self for chaining.
set()
Set a scalar config value by dotted path.
Dotted config path (e.g. "mCrawlConfig.mMaxUrls").
The value to set.
Returns ConfigPatches
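The chaining behaviour described above can be sketched with a small stand-in class. `PatchSketch` and its `values` attribute are hypothetical and exist only for this illustration; they are not the library's implementation, and the second dotted path is made up to show a chained call.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class PatchSketch:
    """Hypothetical stand-in for ConfigPatches; not the library's class."""

    # Scalar overrides keyed by dotted path (this storage layout is an assumption).
    values: dict[str, Any] = field(default_factory=dict)

    def set(self, path: str, value: Any) -> "PatchSketch":
        # Record the override, then return self so calls can be chained.
        self.values[path] = value
        return self


patches = (
    PatchSketch()
    .set("mCrawlConfig.mMaxUrls", 5000)
    .set("mCrawlConfig.mMaxDepth", 3)  # hypothetical path, for illustration only
)
print(patches.values)
```

Because each call returns the same object, a chain mutates one instance rather than creating copies.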
add_extraction()
Add a custom extraction rule.
Display name for the extraction.
XPath or CSS selector expression.
Selector type. One of "XPATH" or "CSS_PATH".
Extraction mode. One of "TEXT", "HTML", "ATTRIBUTE".
HTML attribute name. Required when extract_mode="ATTRIBUTE".
Returns ConfigPatches
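The constraints listed above (allowed selector types, allowed extraction modes, and the attribute requirement) can be checked before building a rule. This is a minimal sketch: `check_extraction_args` is a hypothetical helper, and every parameter name except `extract_mode` is an assumption. Whether the library performs the same validation is not stated in this reference.

```python
from typing import Any, Optional

SELECTOR_TYPES = {"XPATH", "CSS_PATH"}
EXTRACT_MODES = {"TEXT", "HTML", "ATTRIBUTE"}


def check_extraction_args(
    name: str,
    selector: str,
    selector_type: str,
    extract_mode: str,
    attribute: Optional[str] = None,
) -> dict[str, Any]:
    """Hypothetical pre-flight check; parameter names are assumptions."""
    if selector_type not in SELECTOR_TYPES:
        raise ValueError(f"selector_type must be one of {sorted(SELECTOR_TYPES)}")
    if extract_mode not in EXTRACT_MODES:
        raise ValueError(f"extract_mode must be one of {sorted(EXTRACT_MODES)}")
    # An attribute name is only meaningful, and required, in ATTRIBUTE mode.
    if extract_mode == "ATTRIBUTE" and not attribute:
        raise ValueError('attribute is required when extract_mode="ATTRIBUTE"')
    return {
        "name": name,
        "selector": selector,
        "selector_type": selector_type,
        "extract_mode": extract_mode,
        "attribute": attribute,
    }


args = check_extraction_args(
    "Canonical href", "//link[@rel='canonical']", "XPATH", "ATTRIBUTE", "href"
)
print(args)
```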
remove_extraction()
Remove an existing extraction rule by name.
Name of the extraction to remove.
Returns ConfigPatches
clear_extractions()
Remove all custom extraction rules.
Returns ConfigPatches
add_custom_search()
Add a custom search rule.
A CustomSearch instance.
Returns ConfigPatches
remove_custom_search()
Remove a custom search rule by name.
Name of the rule to remove.
Returns ConfigPatches
clear_custom_searches()
Remove all custom search rules.
Returns ConfigPatches
add_custom_javascript()
Add a custom JavaScript rule.
A CustomJavaScript instance.
Returns ConfigPatches
remove_custom_javascript()
Remove a custom JavaScript rule by name.
Name of the rule to remove.
Returns ConfigPatches
clear_custom_javascript()
Remove all custom JavaScript rules.
Returns ConfigPatches
to_dict() → dict[str, Any]
Return the complete patch payload as a Python dict.
to_json(indent=2) → str
Return the complete patch payload as a JSON string.
JSON indentation level.
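The relationship between the two serializers can be sketched with the standard library alone. The payload below is hypothetical (the real keys produced by to_dict() may differ), and treating to_json() as `json.dumps` applied to the dict form is an assumption, not confirmed behaviour.

```python
import json
from typing import Any

# Hypothetical payload; the real structure returned by to_dict() may differ.
payload: dict[str, Any] = {"mCrawlConfig.mMaxUrls": 5000}


def to_json_sketch(data: dict[str, Any], indent: int = 2) -> str:
    # Assumed equivalent of to_json(): serialize the dict form with indentation.
    return json.dumps(data, indent=indent)


text = to_json_sketch(payload)
print(text)

# Round-tripping the string recovers the original dict.
restored = json.loads(text)
```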
CustomSearch
Frozen dataclass. Represents a custom search rule for ConfigPatches.
Fields
Display name for the custom search filter.
Search query string or regex pattern.
Match mode. Common values: "CONTAINS", "REGEX", "DOES_NOT_CONTAIN", "BEGINS_WITH", "ENDS_WITH".
Data type to search. Common values: "TEXT", "REGEX".
Search scope. Common values: "HTML", "TEXT", "URL".
Whether the search is case-sensitive.
Optional XPath expression to scope the search within the page.
to_op() → dict[str, Any]
Return the operation dict for the ConfigBuilder patch payload.
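To show what "frozen dataclass" means in practice, here is a minimal sketch using a stand-in class. `SearchRuleSketch`, its field names, and the shape of the dict returned by its `to_op()` (including the `"op"` key) are all hypothetical; the real CustomSearch fields and operation format may differ.

```python
from dataclasses import FrozenInstanceError, asdict, dataclass, replace
from typing import Any


@dataclass(frozen=True)
class SearchRuleSketch:
    """Hypothetical stand-in for CustomSearch; field names are assumptions."""

    name: str
    query: str
    mode: str = "CONTAINS"
    case_sensitive: bool = False

    def to_op(self) -> dict[str, Any]:
        # The "op" key and overall shape are invented for illustration.
        return {"op": "add_custom_search", **asdict(self)}


rule = SearchRuleSketch(name="Has price", query=r"\$\d+", mode="REGEX")

try:
    rule.mode = "CONTAINS"  # frozen dataclasses reject attribute assignment
except FrozenInstanceError:
    mutated = False
else:
    mutated = True

# To change a field, build a new instance instead of mutating the old one.
strict_rule = replace(rule, case_sensitive=True)
print(rule.to_op())
```

Immutability makes a rule safe to reuse across several patch objects, since no holder can change it after construction.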
CustomJavaScript
Frozen dataclass. Represents a custom JavaScript extraction or rendering rule.
Fields
Display name for the JavaScript rule.
JavaScript source code. The script should return the value to extract.
Rule type. "EXTRACTION" extracts a value; other types are spider-version-dependent.
Execution timeout in seconds.
Comma-separated MIME types the script applies to.
to_op() → dict[str, Any]
Return the operation dict for the ConfigBuilder patch payload.
write_seospider_config()
Apply a ConfigPatches object to a template .seospiderconfig file and write the result.
Path to the source .seospiderconfig file to use as a template.
Path to write the patched .seospiderconfig file.
Patch payload. Either a ConfigPatches instance or a raw dict.
Path to the sf-config-builder JAR or install directory. Required only when the library cannot be located automatically.
Returns Path: the path to the written output file.
get_export_profile()
Load a named export profile from the bundled profile lists.
Profile name. Currently only "kitchen_sink" is supported.
Returns ExportProfile
ExportProfile fields
Ordered list of export tab names (e.g. "Internal:All", "Page Titles:Missing").
Ordered list of bulk export names.