aixplain Web Search
aixplain Web Search is a hosted utility tool that queries a web search engine (Google, Bing, or DuckDuckGo), scrapes clean, tag-free content from the top N results in parallel, and optionally synthesizes a cited answer with the LLM you choose. It replaces the usual two-tool setup (a search tool such as SerpAPI plus a separate scraper such as Firecrawl) with a single tool call. View the asset on the aiXplain Marketplace.
- Asset path: aixplain/aixplain-web-search/aixplain
- Tool ID: 6a0c9044beac0e7cdc60122a
Returning the top 5 results the traditional way takes about 7 LLM calls: one to run the search, five to scrape each URL, and one to write the answer. This tool runs the search and scrapes the top results in parallel in one call, plus one optional call if you want a cited answer. That is 2 calls instead of 7, roughly a 71% reduction in LLM calls, with cleaner, smaller outputs than HTML scrapers.
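The call-count arithmetic above can be checked with a short sketch. The function names here are illustrative only and are not part of the aiXplain SDK; the counts follow the breakdown given in the paragraph (one search, one scrape per result, one answer).

```python
def traditional_calls(num_results: int) -> int:
    """One search call, one scrape call per result, one call to write the answer."""
    return 1 + num_results + 1


def web_search_calls(with_answer: bool = True) -> int:
    """One combined search-and-scrape call, plus an optional answer call."""
    return 1 + (1 if with_answer else 0)


def reduction_pct(before: int, after: int) -> float:
    """Percentage of calls saved when going from `before` to `after`."""
    return 100 * (before - after) / before


before = traditional_calls(5)
after = web_search_calls(with_answer=True)
print(f"{before} -> {after}: {reduction_pct(before, after):.0f}% fewer calls")
# prints "7 -> 2: 71% fewer calls"
```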
Setup
from aixplain import Aixplain
aix = Aixplain(api_key="<AIXPLAIN_API_KEY>")
Get the tool
web_search = aix.Tool.get("6a0c9044beac0e7cdc60122a")
print(web_search.name)
Action and inputs
One action: search. Pass inputs in the data payload.
| Input | Required | Default | Controls |
|---|---|---|---|
| query | Yes | — | The search query |
| num_results | No | 5 | How many top results to fetch and scrape |
| word_limit | No | 100 | Max words kept per scraped page before truncation (token control) |
| timeout | No | — | Per-URL scrape timeout (s). Returns pages that finished within the window instead of waiting on slow ones |
| result_type | No | answer | answer = LLM-synthesized answer with citations; raw = URLs + scraped content, no LLM step |
| search_asset_id | No | platform default | Underlying search engine/model (Google, Bing, DuckDuckGo) |
| llm | No | — | LLM asset used to synthesize the answer (answer mode) |
| max_answer_words | No | — | Caps the synthesized answer length (answer mode) |
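A small client-side helper can catch misspelled input names before the tool is called. The sketch below is a hypothetical convenience, not part of the aiXplain SDK: `build_search_payload` is an invented name, and the only facts it relies on are the input names and the two `result_type` values listed in the table. Optional inputs that are not supplied are left out so the tool applies its own defaults.

```python
from typing import Any

# Input names taken from the table above.
KNOWN_OPTIONS = {
    "num_results", "word_limit", "timeout", "result_type",
    "search_asset_id", "llm", "max_answer_words",
}
RESULT_TYPES = {"answer", "raw"}


def build_search_payload(query: str, **options: Any) -> dict[str, Any]:
    """Hypothetical helper: validate inputs and build the `data` payload."""
    if not query or not query.strip():
        raise ValueError("query is required")
    unknown = set(options) - KNOWN_OPTIONS
    if unknown:
        raise ValueError(f"unknown inputs: {sorted(unknown)}")
    result_type = options.get("result_type")
    if result_type is not None and result_type not in RESULT_TYPES:
        raise ValueError("result_type must be 'answer' or 'raw'")
    payload: dict[str, Any] = {"query": query.strip()}
    # Omit unset options so the tool's own defaults apply.
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload


payload = build_search_payload("what is retrieval augmented generation",
                               result_type="raw", num_results=3)
print(payload)
```

The resulting dictionary could then be passed as `data=payload` in the `run` calls shown in the following sections.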
Answer mode (cited LLM overview)
result = web_search.run(action="search", data={
    "query": "what is retrieval augmented generation",
    "num_results": 3, "word_limit": 80, "max_answer_words": 80,
})
print(result.data)
Raw mode (URLs + clean scraped content)
Set result_type="raw" to skip the LLM step and get the scraped pages directly — one record per result with title, URL, snippet, and cleaned page text.
result = web_search.run(action="search", data={
    "query": "what is retrieval augmented generation",
    "result_type": "raw", "num_results": 3, "word_limit": 60,
})
print(result.data)
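Once the raw records are in hand, they can be turned into a numbered source list for prompting or citation. The sketch below runs on hand-written sample data and assumes each record is a dictionary with `title`, `url`, and `content` keys; those key names are an assumption made for illustration, so inspect the actual `result.data` before relying on them.

```python
def format_sources(records: list[dict], max_words: int = 60) -> str:
    """Render records as numbered, citable excerpts.

    Assumes (hypothetically) that each record has 'title', 'url' and
    'content' keys; missing keys fall back to placeholders.
    """
    blocks = []
    for i, rec in enumerate(records, start=1):
        words = str(rec.get("content", "")).split()
        excerpt = " ".join(words[:max_words])  # keep at most max_words words
        header = f"[{i}] {rec.get('title', 'untitled')} ({rec.get('url', '')})"
        blocks.append(f"{header}\n{excerpt}")
    return "\n\n".join(blocks)


# Sample data standing in for a raw-mode response.
sample = [
    {"title": "RAG overview", "url": "https://example.com/rag",
     "content": "Retrieval augmented generation combines search with a language model."},
    {"title": "RAG in practice", "url": "https://example.com/practice",
     "content": "Chunking and ranking affect answer quality."},
]
print(format_sources(sample, max_words=5))
```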
Choosing your setup (how an agent uses it)
Inside an agent, the setup you pick changes how the agent gathers evidence for each sub-question:
- Search tool + scraper (e.g. Tavily + Firecrawl): the agent's LLM (1) calls search → gets URLs + snippets, (2) decides which 2–3 URLs look best, (3) calls the scraper on each → clean markdown, (4) extracts quotes from that text.
- Web Search, answer mode (default): a single Web Search call → search + scrape + summarize top-5 → answer + references. No separate scraper.
- Web Search, raw mode (result_type="raw"): a single call → search + parallel scrape of the top results → clean page text (no summary). The agent reads the raw text and extracts/cites itself.
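To make the idea of a parallel scrape with a time window concrete, here is a minimal standard-library sketch. It is not the tool's implementation: `fake_scrape` simulates page fetches with `time.sleep`, and a single shared deadline stands in for the per-URL timeout, which behaves the same way when all scrapes start together.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def fake_scrape(url: str, delay: float) -> str:
    """Stand-in for a real page fetch; sleeps to simulate latency."""
    time.sleep(delay)
    return f"text from {url}"


def scrape_within(pages: dict[str, float], timeout: float) -> dict[str, str]:
    """Scrape all pages in parallel; keep only those done within `timeout` seconds."""
    executor = ThreadPoolExecutor(max_workers=len(pages) or 1)
    futures = {executor.submit(fake_scrape, url, delay): url
               for url, delay in pages.items()}
    done, _pending = wait(futures, timeout=timeout)
    # Do not block on slow pages; their results are simply dropped.
    executor.shutdown(wait=False, cancel_futures=True)
    return {futures[f]: f.result() for f in done}


if __name__ == "__main__":
    pages = {"https://a.example": 0.01, "https://slow.example": 1.0}
    print(sorted(scrape_within(pages, timeout=0.3)))
```

Slow pages are dropped rather than awaited, so total latency is bounded by the timeout instead of by the slowest site.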
The behavioral axes that matter
| Setup | Tool calls / sub-question | Who scrapes | Who summarizes | Failure points |
|---|---|---|---|---|
| Search + scraper | 1 search + several scrapes | the scraper | the agent (reads raw) | many (two tools, chained) |
| Web Search (answer) | 1 | Web Search Tool | the tool's LLM | few |
| Web Search (raw) | 1 | Web Search Tool (parallel) | the agent (reads raw) | few |
Pick: answer for the fewest moving parts plus a ready cited summary; raw when the agent should read and cite sources itself. Keep num_results / word_limit modest in raw mode so large payloads don't bloat the agent's context.
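A rough way to size raw-mode payloads is to multiply the two knobs. The sketch below assumes the scraped text dominates the payload and ignores titles, URLs, and snippets, so it gives an approximate upper bound on page-text words rather than an exact token count. The function names are illustrative only.

```python
def raw_text_word_bound(num_results: int = 5, word_limit: int = 100) -> int:
    """Upper bound on scraped page-text words returned in raw mode."""
    return num_results * word_limit


def max_word_limit(budget_words: int, num_results: int) -> int:
    """Largest word_limit that keeps the page text within a word budget."""
    return budget_words // num_results


print(raw_text_word_bound())         # documented defaults: 5 * 100 = 500
print(raw_text_word_bound(3, 60))    # the raw-mode example: 3 * 60 = 180
print(max_word_limit(1000, 5))       # 1000-word budget over 5 pages: 200
```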
Use with an Agent
web_search = aix.Tool.get("6a0c9044beac0e7cdc60122a")
web_search.allowed_actions = ["search"]
agent = aix.Agent(
    name="Web Researcher",
    description="Answers questions using live web search with cited sources",
    instructions=(
        "For any question that needs current information, call the Web Search Tool. "
        "Use result_type 'answer' for a cited summary, or 'raw' when you need to read and "
        "quote the source pages yourself. Raise num_results for broader coverage."
    ),
    tools=[web_search],
)
agent.save()
Each agent step that calls this tool replaces a search call plus several scrape calls, so agents run with fewer LLM calls and smaller tool outputs.
Notes
- Accuracy: comparable to a search-and-scrape workflow, because content comes directly from the engine's results and the relevant page text is extracted.
- Token efficiency: scraped content is tag-free and size-capped via word_limit.
- Citations: in answer mode the answer is returned with the source URLs that support it.
- Built-in utility: no connect/OAuth step; fetch it with aix.Tool.get(...).