Interface SeedUrlConfiguration.Builder

All Superinterfaces:
Buildable, CopyableBuilder<SeedUrlConfiguration.Builder,SeedUrlConfiguration>, SdkBuilder<SeedUrlConfiguration.Builder,SeedUrlConfiguration>, SdkPojo
Enclosing class:
SeedUrlConfiguration

public static interface SeedUrlConfiguration.Builder extends SdkPojo, CopyableBuilder<SeedUrlConfiguration.Builder,SeedUrlConfiguration>
  • Method Details

    • seedUrls

      SeedUrlConfiguration.Builder seedUrls(Collection<String> seedUrls)

      The list of seed or starting point URLs of the websites you want to crawl.

      The list can include a maximum of 100 seed URLs.

      Parameters:
      seedUrls - The list of seed or starting point URLs of the websites you want to crawl. The list can include a maximum of 100 seed URLs.

      Returns:
      Returns a reference to this object so that method calls can be chained together.
    • seedUrls

      SeedUrlConfiguration.Builder seedUrls(String... seedUrls)

      The list of seed or starting point URLs of the websites you want to crawl.

      The list can include a maximum of 100 seed URLs.

      Parameters:
      seedUrls - The list of seed or starting point URLs of the websites you want to crawl. The list can include a maximum of 100 seed URLs.

      Returns:
      Returns a reference to this object so that method calls can be chained together.
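
      Because the list is capped at 100 seed URLs, it can help to check a list before handing it to either seedUrls overload. The following is a minimal sketch, not part of the SDK: the class SeedUrlChecks and its validate method are hypothetical names, and the requirement that each entry parse as an absolute URI with a host is an assumption of this sketch rather than a documented rule.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper (not an SDK class) that checks seed URLs before they reach the builder. */
final class SeedUrlChecks {
    /** Maximum number of seed URLs, as stated in the method description. */
    static final int MAX_SEED_URLS = 100;

    private SeedUrlChecks() {}

    /**
     * Returns an unmodifiable copy of the list if it respects the 100-URL limit and every
     * entry parses as an absolute URI with a host; otherwise throws IllegalArgumentException.
     * The per-URL syntax check is an assumption of this sketch, not a documented service rule.
     */
    static List<String> validate(List<String> seedUrls) {
        if (seedUrls == null) {
            throw new IllegalArgumentException("seedUrls must not be null");
        }
        if (seedUrls.size() > MAX_SEED_URLS) {
            throw new IllegalArgumentException(
                    "at most " + MAX_SEED_URLS + " seed URLs are allowed, got " + seedUrls.size());
        }
        for (String url : seedUrls) {
            if (url == null) {
                throw new IllegalArgumentException("seed URL must not be null");
            }
            URI uri;
            try {
                uri = new URI(url);
            } catch (URISyntaxException e) {
                throw new IllegalArgumentException("malformed seed URL: " + url, e);
            }
            if (!uri.isAbsolute() || uri.getHost() == null) {
                throw new IllegalArgumentException("seed URL needs a scheme and a host: " + url);
            }
        }
        return Collections.unmodifiableList(new ArrayList<>(seedUrls));
    }

    public static void main(String[] args) {
        List<String> urls = validate(
                Arrays.asList("https://abc.example.com", "https://example.org/docs"));
        System.out.println(urls.size() + " seed URLs accepted");
    }
}
```

      In an application, the validated list would then be passed to seedUrls(...) on the builder.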
    • webCrawlerMode

      SeedUrlConfiguration.Builder webCrawlerMode(String webCrawlerMode)

      You can choose one of the following modes:

      • HOST_ONLY—crawl only the website host names. For example, if the seed URL is "abc.example.com", then only URLs with host name "abc.example.com" are crawled.

      • SUBDOMAINS—crawl the website host names with subdomains. For example, if the seed URL is "abc.example.com", then "a.abc.example.com" and "b.abc.example.com" are also crawled.

      • EVERYTHING—crawl the website host names with subdomains and other domains that the web pages link to.

      The default mode is set to HOST_ONLY.

      Parameters:
      webCrawlerMode - You can choose one of the following modes:

      • HOST_ONLY—crawl only the website host names. For example, if the seed URL is "abc.example.com", then only URLs with host name "abc.example.com" are crawled.

      • SUBDOMAINS—crawl the website host names with subdomains. For example, if the seed URL is "abc.example.com", then "a.abc.example.com" and "b.abc.example.com" are also crawled.

      • EVERYTHING—crawl the website host names with subdomains and other domains that the web pages link to.

      The default mode is set to HOST_ONLY.

      Returns:
      Returns a reference to this object so that method calls can be chained together.
      See Also:
      WebCrawlerMode
    • webCrawlerMode

      SeedUrlConfiguration.Builder webCrawlerMode(WebCrawlerMode webCrawlerMode)

      You can choose one of the following modes:

      • HOST_ONLY—crawl only the website host names. For example, if the seed URL is "abc.example.com", then only URLs with host name "abc.example.com" are crawled.

      • SUBDOMAINS—crawl the website host names with subdomains. For example, if the seed URL is "abc.example.com", then "a.abc.example.com" and "b.abc.example.com" are also crawled.

      • EVERYTHING—crawl the website host names with subdomains and other domains that the web pages link to.

      The default mode is set to HOST_ONLY.

      Parameters:
      webCrawlerMode - You can choose one of the following modes:

      • HOST_ONLY—crawl only the website host names. For example, if the seed URL is "abc.example.com", then only URLs with host name "abc.example.com" are crawled.

      • SUBDOMAINS—crawl the website host names with subdomains. For example, if the seed URL is "abc.example.com", then "a.abc.example.com" and "b.abc.example.com" are also crawled.

      • EVERYTHING—crawl the website host names with subdomains and other domains that the web pages link to.

      The default mode is set to HOST_ONLY.

      Returns:
      Returns a reference to this object so that method calls can be chained together.
      See Also:
      WebCrawlerMode
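
      The three modes differ only in which host names count as in scope for a given seed host. The sketch below is one reading of the mode descriptions above, not the service's actual crawling logic: the class CrawlScopeSketch, its nested Mode enum (a stand-in for WebCrawlerMode so the example compiles without the SDK), and the inScope method are all hypothetical, and the sketch assumes the candidate host was reached through a link from a crawled page.

```java
import java.util.Locale;

/** Hypothetical illustration (not an SDK class) of how the three crawl modes scope host names. */
final class CrawlScopeSketch {
    /** Stand-in for the SDK's WebCrawlerMode enum so that this sketch is self-contained. */
    enum Mode { HOST_ONLY, SUBDOMAINS, EVERYTHING }

    private CrawlScopeSketch() {}

    /**
     * Returns true if a linked host would be in scope for the given seed host under the mode,
     * following the mode descriptions in this section. Host names are compared case-insensitively.
     */
    static boolean inScope(Mode mode, String seedHost, String candidateHost) {
        String seed = seedHost.toLowerCase(Locale.ROOT);
        String candidate = candidateHost.toLowerCase(Locale.ROOT);
        switch (mode) {
            case HOST_ONLY:
                // Only the seed host itself.
                return candidate.equals(seed);
            case SUBDOMAINS:
                // The seed host plus any host that ends with ".<seed host>".
                return candidate.equals(seed) || candidate.endsWith("." + seed);
            case EVERYTHING:
                // Any host that the crawled pages link to.
                return true;
            default:
                throw new AssertionError(mode);
        }
    }

    public static void main(String[] args) {
        String seed = "abc.example.com";
        for (Mode mode : Mode.values()) {
            System.out.println(mode + ": a.abc.example.com in scope = "
                    + inScope(mode, seed, "a.abc.example.com"));
        }
    }
}
```

      The leading dot in the SUBDOMAINS check keeps an unrelated host such as "xabc.example.com" from matching the seed "abc.example.com".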