Version: v2.0

aixplain Web Search

aixplain Web Search is a hosted utility tool that queries a web search engine (Google, Bing, or DuckDuckGo), scrapes clean, tag-free content from the top N results in parallel, and optionally synthesizes a cited answer with the LLM you choose. It replaces the usual two-tool setup of a search tool plus a separate scraper (for example, SerpAPI + Firecrawl) with a single tool call. View the asset on the aiXplain Marketplace.

  • Asset path: aixplain/aixplain-web-search/aixplain
  • Tool ID: 6a0c9044beac0e7cdc60122a

Use this instead of a search tool + scraper

Answering from the top 5 results the traditional way takes about 7 LLM calls: one to run the search, five to scrape the URLs (one each), and one to write the answer. This tool runs the search and scrapes the top results in parallel in a single call, plus one optional call if you want a cited answer. That is 2 calls instead of 7, roughly a 71% reduction in LLM calls, with cleaner, smaller outputs than HTML scrapers.
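
The call counts quoted above can be checked with a little arithmetic. The sketch below is illustrative only: the function name `llm_call_reduction` is hypothetical, and it assumes the traditional chain costs one search call, one scrape call per URL, and one answer-writing call.

```python
def llm_call_reduction(num_results: int = 5, cited_answer: bool = True) -> float:
    """Fraction of calls saved versus a search-plus-scraper chain (assumed model)."""
    # Traditional chain: 1 search + one scrape per URL + 1 answer-writing call.
    traditional = 1 + num_results + 1
    # Web Search: 1 combined search-and-scrape call, plus 1 optional answer call.
    combined = 1 + (1 if cited_answer else 0)
    return (traditional - combined) / traditional


if __name__ == "__main__":
    # 7 calls down to 2 saves 5 of 7 calls.
    print(f"{llm_call_reduction():.0%}")  # prints 71%
```

Under the same assumptions, skipping the cited answer (raw results only) leaves a single call, so the saving is 6 of 7 calls.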


Setup

from aixplain import Aixplain

aix = Aixplain(api_key="<AIXPLAIN_API_KEY>")

Get the tool

web_search = aix.Tool.get("6a0c9044beac0e7cdc60122a")
print(web_search.name)

Action and inputs

One action: search. Pass inputs in the data payload.

| Input | Required | Default | Controls |
| --- | --- | --- | --- |
| query | Yes | | The search query |
| num_results | No | 5 | How many top results to fetch and scrape |
| word_limit | No | 100 | Max words kept per scraped page before truncation (token control) |
| timeout | No | | Per-URL scrape timeout (s). Returns pages that finished within the window instead of waiting on slow ones |
| result_type | No | answer | answer = LLM-synthesized answer with citations; raw = URLs + scraped content, no LLM step |
| search_asset_id | No | platform default | Underlying search engine/model (Google, Bing, DuckDuckGo) |
| llm | No | | LLM asset used to synthesize the answer (answer mode) |
| max_answer_words | No | | Caps the synthesized answer length (answer mode) |
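
As a sketch of how the inputs above fit together, the helper below assembles a data payload and rejects input names that are not in the table. The helper `build_search_payload` and the constants `KNOWN_OPTIONS` and `RESULT_TYPES` are hypothetical conveniences, not part of the aixplain SDK; the tool may well perform its own validation.

```python
from typing import Any

# Optional input names copied from the inputs table. This helper is a
# hypothetical convenience, not an aixplain SDK function.
KNOWN_OPTIONS = {
    "num_results", "word_limit", "timeout", "result_type",
    "search_asset_id", "llm", "max_answer_words",
}
RESULT_TYPES = {"answer", "raw"}


def build_search_payload(query: str, **options: Any) -> dict[str, Any]:
    """Build the `data` dict for the search action, rejecting unknown inputs."""
    if not query.strip():
        raise ValueError("query is required")
    unknown = set(options) - KNOWN_OPTIONS
    if unknown:
        raise ValueError(f"unknown inputs: {sorted(unknown)}")
    if options.get("result_type", "answer") not in RESULT_TYPES:
        raise ValueError("result_type must be 'answer' or 'raw'")
    # Leave unset options out so the tool applies its own defaults.
    payload: dict[str, Any] = {"query": query}
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload
```

The result can then be passed as `data=build_search_payload("what is retrieval augmented generation", num_results=3)` in a `web_search.run(action="search", ...)` call.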

Answer mode (cited LLM overview)

result = web_search.run(action="search", data={
    "query": "what is retrieval augmented generation",
    "num_results": 3, "word_limit": 80, "max_answer_words": 80,
})
print(result.data)

Raw mode (URLs + clean scraped content)

Set result_type="raw" to skip the LLM step and get the scraped pages directly — one record per result with title, URL, snippet, and cleaned page text.

result = web_search.run(action="search", data={
    "query": "what is retrieval augmented generation",
    "result_type": "raw", "num_results": 3, "word_limit": 60,
})
print(result.data)
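
Once raw records are in hand, an agent or script usually needs them as a numbered source list it can quote from. The sketch below assumes each record is a dict with `title`, `url`, and `content` keys; those key names are guesses based on the description above (title, URL, snippet, cleaned page text), so inspect the real `result.data` first and adjust. `format_sources` is a hypothetical helper, and the sample record is a made-up placeholder.

```python
def format_sources(records: list[dict], text_key: str = "content") -> str:
    """Render raw-mode records as a numbered source list for a prompt.

    The key names ("title", "url", text_key) are assumptions about the
    record shape, not a documented schema.
    """
    blocks = []
    for i, rec in enumerate(records, start=1):
        title = rec.get("title", "(untitled)")
        url = rec.get("url", "")
        text = rec.get(text_key, "")
        blocks.append(f"[{i}] {title}\n{url}\n{text}".strip())
    return "\n\n".join(blocks)


if __name__ == "__main__":
    # Placeholder record for illustration only, not real tool output.
    sample = [{"title": "Example page", "url": "https://example.com",
               "content": "Cleaned page text goes here."}]
    print(format_sources(sample))
```

Numbering the sources lets the agent cite them as [1], [2], and so on when it writes its own answer.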

Choosing your setup (how an agent uses it)

Inside an agent, the mode you pick changes how a research agent gathers evidence for each sub-question:

  • Search tool + scraper (e.g. Tavily + Firecrawl): the agent's LLM (1) calls search → gets URLs + snippets, (2) decides which 2–3 URLs look best, (3) calls the scraper on each → clean markdown, (4) extracts quotes from that text.
  • Web Search, answer mode (default): a single Web Search call → search + scrape + summarize top-5 → answer + references. No separate scraper.
  • Web Search, raw mode (result_type="raw"): a single call → search + parallel scrape of the top results → clean page text (no summary). The agent reads the raw text and extracts/cites itself.

The behavioral axes that matter

| Setup | Tool calls per sub-question | Who scrapes | Who summarizes | Failure points |
| --- | --- | --- | --- | --- |
| Search + scraper | 1 search + several scrapes | the scraper | the agent (reads raw) | many (two tools, chained) |
| Web Search (answer) | 1 | Web Search Tool | the tool's LLM | few |
| Web Search (raw) | 1 | Web Search Tool (parallel) | the agent (reads raw) | few |

Pick answer mode for the fewest moving parts plus a ready cited summary; pick raw mode when the agent should read and cite sources itself. Keep num_results and word_limit modest in raw mode so large payloads don't bloat the agent's context.
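
To make "modest" concrete, a rough upper bound on the scraped text returned by one raw-mode call is num_results × word_limit words, since word_limit caps each page. The sketch below is an estimate under that assumption only: titles, URLs, and snippets add a little on top, and the function names are hypothetical.

```python
def raw_payload_word_budget(num_results: int = 5, word_limit: int = 100) -> int:
    """Upper bound on scraped words from one raw-mode call.

    Defaults mirror the documented defaults (5 results, 100 words per page).
    Titles, URLs, and snippets are not counted.
    """
    return num_results * word_limit


def fits_budget(num_results: int, word_limit: int, max_words: int) -> bool:
    """True if the scraped text stays within a caller-chosen word budget."""
    return raw_payload_word_budget(num_results, word_limit) <= max_words


if __name__ == "__main__":
    print(raw_payload_word_budget())       # 500 words at the defaults
    print(raw_payload_word_budget(3, 60))  # 180 words for the raw-mode example
```

Raising num_results widens coverage while raising word_limit deepens each page; both multiply into the payload the agent has to read.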


Use with an Agent

web_search = aix.Tool.get("6a0c9044beac0e7cdc60122a")
web_search.allowed_actions = ["search"]

agent = aix.Agent(
    name="Web Researcher",
    description="Answers questions using live web search with cited sources",
    instructions=(
        "For any question that needs current information, call the Web Search Tool. "
        "Use result_type 'answer' for a cited summary, or 'raw' when you need to read and "
        "quote the source pages yourself. Raise num_results for broader coverage."
    ),
    tools=[web_search],
)
agent.save()

Note: Each agent step that calls this tool replaces a search call plus several scrape calls, so agents run with fewer LLM calls and smaller tool outputs.


Notes

  • Accuracy: comparable to a search-and-scrape workflow, because content comes directly from the engine's results and the relevant page text is extracted.
  • Token efficiency: scraped content is tag-free and size-capped via word_limit.
  • Citations: in answer mode, the answer is returned with the source URLs that support it.
  • Built-in utility: there is no connect/OAuth step; fetch it with aix.Tool.get(...).