Build a company research agent that replaces 12 open tabs

Colin Gillingham · 5 min read
Tags: gtm-automation, hubspot, ai-agents, sales-automation, claude-agent-sdk

This post is part of the GTM Automation Playbook — a 13-part series on building AI-powered GTM agents with HubSpot.


Every sales rep I've watched does the same thing before a first call. They open LinkedIn, the company website, Crunchbase, maybe Google News, maybe G2. They scan, copy bits into a doc or their head, then try to synthesize it into something useful. It takes 30-60 minutes per account. The output is inconsistent. And it lives nowhere permanent.

This is the first thing I'd automate on any GTM team. Not because it's the highest-leverage play, but because it's the most obvious waste of human time.

Why Claude Agent SDK and not a workflow tool

I use the Claude Agent SDK for this instead of n8n or Gumloop because company research is fundamentally an open-ended task. You don't know in advance how many searches you'll need or what follow-up questions the data will raise. A workflow tool forces you to pre-define the sequence. An agent decides what to research next based on what it's already found.

The Agent SDK gives Claude access to tools — web search, your CRM, whatever APIs you connect — and lets it run an autonomous loop: search, read results, decide what's missing, search again, synthesize, write to HubSpot. You define the tools and the goal. The agent handles orchestration.

Set up the HubSpot properties first

Before you build anything, create custom properties in HubSpot for the data your agent will write. Go to Settings > Properties > Company Properties, or use the Properties API (POST /crm/v3/properties/companies).

I create these: tech_stack (textarea), funding_stage (dropdown), icp_fit_score (number), research_summary (textarea), last_researched_date (date), key_contacts_summary (textarea). Use the companyinformation group. Both type and fieldType are required when creating via API. The pairs for this set are string/textarea for long text, number/number for scores, enumeration/select for the dropdown (which also needs an options list), and date/date for the date.
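As a rough sketch, the six properties could be written as request bodies for the Properties API like this. The labels and the funding_stage option values are illustrative placeholders, not values from a real portal, and the helper name prop is made up for this example. The snippet only builds the payloads; sending each one as a POST with your private app token is left out.

```python
GROUP = "companyinformation"


def prop(name, label, type_, field_type, options=None):
    """Build one request body for POST /crm/v3/properties/companies."""
    body = {
        "name": name,
        "label": label,
        "type": type_,            # how HubSpot stores the value
        "fieldType": field_type,  # how the field renders in the UI
        "groupName": GROUP,
    }
    if options:
        # Enumeration properties carry their allowed values as an options list.
        body["options"] = [
            {"label": o, "value": o.lower().replace(" ", "_"), "displayOrder": i}
            for i, o in enumerate(options)
        ]
    return body


# Hypothetical stage list; replace with whatever your team segments on.
FUNDING_STAGES = ["Seed", "Series A", "Series B", "Series C Plus", "Public"]

PROPERTY_DEFINITIONS = [
    prop("tech_stack", "Tech stack", "string", "textarea"),
    prop("funding_stage", "Funding stage", "enumeration", "select", FUNDING_STAGES),
    prop("icp_fit_score", "ICP fit score", "number", "number"),
    prop("research_summary", "Research summary", "string", "textarea"),
    prop("last_researched_date", "Last researched date", "date", "date"),
    prop("key_contacts_summary", "Key contacts summary", "string", "textarea"),
]
```

Keeping the definitions in one list makes it easy to loop over them once at setup time and to reuse the property names when the agent writes data later.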

This step matters. If the agent has nowhere to write structured data, it'll dump everything into a note. Notes are searchable but not filterable or usable in workflows. Properties are.

Build the agent

Install the SDK: pip install claude-agent-sdk. You'll need an Anthropic API key and a HubSpot private app access token with CRM scopes (for this agent, crm.objects.companies.read and crm.objects.companies.write).

The agent needs two custom tools: one to search HubSpot for existing company records (so you don't create duplicates), and one to create or update a company. Both hit HubSpot's Companies API (/crm/v3/objects/companies). Define them using the @tool decorator and wire them into an in-process MCP server with create_sdk_mcp_server().

from claude_agent_sdk import tool, create_sdk_mcp_server

@tool("hubspot_search_company",
      "Search HubSpot for a company by domain",
      {"domain": str})
async def hubspot_search_company(args):
    # POST /crm/v3/objects/companies/search
    # Filter: propertyName=domain, operator=EQ, value=args["domain"]
    ...

@tool("hubspot_upsert_company",
      "Create or update a company in HubSpot",
      {"company_id": str, "properties": dict})
async def hubspot_upsert_company(args):
    # If company_id: PATCH /crm/v3/objects/companies/{id}
    # Else: POST /crm/v3/objects/companies
    ...

server = create_sdk_mcp_server(
    name="company-research-tools",
    version="1.0.0",
    tools=[hubspot_search_company, hubspot_upsert_company]
)
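The tool bodies above are left as stubs. A minimal sketch of the request shapes they would send might look like the following. The helper names build_search_body and build_upsert_request are hypothetical, and the snippet stops short of making the HTTP call so it runs without a token or an HTTP client.

```python
BASE_URL = "https://api.hubapi.com"


def build_search_body(domain, properties=("name", "domain")):
    """Body for POST /crm/v3/objects/companies/search, matching on domain."""
    return {
        "filterGroups": [
            {
                "filters": [
                    {"propertyName": "domain", "operator": "EQ", "value": domain}
                ]
            }
        ],
        "properties": list(properties),  # fields to return on each match
        "limit": 1,                      # one match is enough to avoid a duplicate
    }


def build_upsert_request(company_id, properties):
    """Return (method, url, body): PATCH if the record exists, POST otherwise."""
    body = {"properties": properties}
    if company_id:
        return "PATCH", f"{BASE_URL}/crm/v3/objects/companies/{company_id}", body
    return "POST", f"{BASE_URL}/crm/v3/objects/companies", body
```

Inside each tool you would send the request with your HTTP client of choice, passing the private app token as a Bearer header, and hand the response back to the agent as text.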

Web search is a built-in capability — you don't need to implement a scraping tool. The agent will use it automatically when you include it in allowed_tools.

The system prompt is where you define what good research looks like. Mine tells the agent to gather firmographic data (industry, headcount, revenue, HQ), technographic data (tools and platforms), leadership info, recent news and funding, and competitive context. Then synthesize into a structured brief with an ICP fit assessment, pain point hypotheses, and talking points. Then write the structured fields to HubSpot and return the full brief.
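For illustration, a prompt along those lines could be written as a plain string constant. The wording below is a sketch assembled from the description above, not the exact prompt I run, and the score range of 0 to 100 is an assumption.

```python
# Illustrative prompt only; tune the wording and the ICP criteria to your own team.
SYSTEM_PROMPT = """\
You are a company research analyst supporting a B2B sales team.

Given a company domain, gather:
- Firmographic data: industry, headcount, revenue range, HQ location
- Technographic data: the specific tools and platforms the company uses
- Leadership: key executives and their titles
- Recent news and funding
- Competitive context

Then synthesize a structured brief containing:
- An ICP fit assessment with a score from 0 to 100 and your reasoning
- Pain point hypotheses
- Talking points tied to something specific about the company

Before writing anything, search HubSpot by domain for an existing record.
Write the structured fields to HubSpot (tech_stack, funding_stage,
icp_fit_score, research_summary, last_researched_date, key_contacts_summary),
then return the full brief.
"""
```

Naming the HubSpot properties in the prompt gives the agent a fixed target for its output, which is what keeps the CRM fields consistent from one run to the next.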

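from claude_agent_sdk import ClaudeAgentOptions

options = ClaudeAgentOptions(
    system_prompt=SYSTEM_PROMPT,
    model="claude-sonnet-4-5-20250929",
    # The key "crm" becomes the middle part of each MCP tool name below.
    mcp_servers={"crm": server},
    # Tool names follow the pattern mcp__<server key>__<tool name>.
    allowed_tools=[
        "mcp__crm__hubspot_search_company",
        "mcp__crm__hubspot_upsert_company",
        "WebSearch"
    ]
)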

Wrap the agent call in an async function, here called research_company, and run it with asyncio.run(research_company("stripe.com")). The agent will typically execute 4-8 web searches, check HubSpot for an existing record, synthesize everything, and upsert the company with all custom properties populated.
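The research_company function itself isn't shown in this post. One way to write it, assuming the SDK's ClaudeSDKClient interface and passing the options object in as a second argument (an addition for this sketch), is roughly:

```python
import asyncio


def build_research_prompt(domain: str) -> str:
    """Turn a company domain into the task handed to the agent (hypothetical wording)."""
    return (
        f"Research the company at {domain}. Check HubSpot for an existing "
        "record first, then write the structured fields and return the full brief."
    )


async def research_company(domain: str, options=None) -> str:
    """Run one research pass and return all text the agent emitted."""
    # Imported here so this module still loads where the SDK is not installed.
    from claude_agent_sdk import AssistantMessage, ClaudeSDKClient, TextBlock

    chunks = []
    async with ClaudeSDKClient(options=options) as client:
        await client.query(build_research_prompt(domain))
        # receive_response() yields messages until the agent finishes this turn.
        async for message in client.receive_response():
            if isinstance(message, AssistantMessage):
                for block in message.content:
                    if isinstance(block, TextBlock):
                        chunks.append(block.text)
    return "\n".join(chunks)
```

With the options object defined earlier, the call becomes asyncio.run(research_company("stripe.com", options)). Treat this as a starting point and check it against the SDK version you have installed.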

What the output looks like

A good research brief from this agent includes: a one-paragraph company summary, headcount and revenue range, tech stack (specific tools, not categories), 3-5 key contacts with titles, recent news (funding, product launches, leadership changes), an ICP fit score with reasoning, and 2-3 talking points tied to something specific about the company.

The structured properties land in HubSpot where they're filterable and usable in workflows. The full brief goes into research_summary. The agent updates last_researched_date so you can trigger re-research on a schedule.
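To make that concrete, here is a small sketch of how a finished brief could be mapped onto the custom properties, plus a staleness check for scheduled re-research. The shape of the brief dict and the 90-day threshold are assumptions for illustration. The date is written as an ISO YYYY-MM-DD string, one of the formats HubSpot date properties accept.

```python
from datetime import date, timedelta
from typing import Optional


def build_company_properties(brief: dict, today: date) -> dict:
    """Map a research brief (hypothetical shape) onto the custom HubSpot properties."""
    return {
        "tech_stack": ", ".join(brief["tech_stack"]),
        "funding_stage": brief["funding_stage"],
        "icp_fit_score": brief["icp_fit_score"],
        "research_summary": brief["summary"],
        "key_contacts_summary": "; ".join(brief["key_contacts"]),
        "last_researched_date": today.isoformat(),  # YYYY-MM-DD
    }


def needs_reresearch(last_researched: Optional[str], today: date,
                     max_age_days: int = 90) -> bool:
    """True if the company was never researched or the research is older than the limit."""
    if not last_researched:
        return True
    age = today - date.fromisoformat(last_researched)
    return age > timedelta(days=max_age_days)
```

A scheduled job can filter companies with needs_reresearch and feed only the stale ones back through the agent.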

Practical considerations

Cost. Each research run uses roughly 10-15K input tokens and 2-3K output tokens across the agentic loop. At Sonnet pricing, that's under $0.10 per company. If you're researching 200 accounts a month, you're looking at less than $20.
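The arithmetic behind that estimate is easy to reproduce. The per-million-token prices below ($3 input, $15 output) are assumed list prices for Sonnet; check the current pricing page before budgeting against them.

```python
# Assumed Sonnet list prices in USD per million tokens; verify before relying on them.
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00


def run_cost(input_tokens: int, output_tokens: int) -> float:
    """Token cost in USD for one research run."""
    return (
        input_tokens * INPUT_PRICE_PER_MTOK / 1_000_000
        + output_tokens * OUTPUT_PRICE_PER_MTOK / 1_000_000
    )


low = run_cost(10_000, 2_000)    # light run
high = run_cost(15_000, 3_000)   # heavy run
monthly_high = 200 * high        # 200 accounts, all heavy runs

print(f"per company: ${low:.2f} to ${high:.2f}, monthly ceiling: ${monthly_high:.2f}")
```

Under these assumptions a run costs between about six and nine cents, and 200 heavy runs land around $18 a month.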

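Rate limits. HubSpot's API allows 100 requests per 10 seconds on private apps (the limit is higher on some paid tiers), and the search endpoint has its own lower per-second cap. The agent's natural pacing stays well within both for a single company. If you're running batch research, add a delay between companies.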
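A simple way to pace a batch is to run companies one at a time with a fixed pause between them. In this sketch the function name research_batch, the two-second default, and the injectable sleep parameter are all assumptions. research_one stands in for whatever coroutine runs a single research pass.

```python
import asyncio


async def research_batch(domains, research_one, delay_seconds=2.0, sleep=asyncio.sleep):
    """Research domains sequentially, pausing between companies.

    Returns a dict of domain -> result, or domain -> exception if that run failed,
    so one bad company does not abort the rest of the batch.
    """
    results = {}
    for i, domain in enumerate(domains):
        if i:
            # No pause before the first company, only between companies.
            await sleep(delay_seconds)
        try:
            results[domain] = await research_one(domain)
        except Exception as exc:
            results[domain] = exc
    return results
```

Sequential runs are slower than firing everything at once, but for a few hundred accounts a month the simplicity is worth more than the throughput.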

Deduplication. Always search by domain before creating. The search-then-upsert pattern prevents duplicate records. HubSpot's own deduplication can catch some cases, but the agent should handle it explicitly.
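Searching by domain only prevents duplicates if the same company always produces the same domain string. A small normalizer, sketched here under the assumption that HubSpot stores the bare domain (for example stripe.com), can run before every search. The function name normalize_domain is made up for this example.

```python
from urllib.parse import urlparse


def normalize_domain(raw: str) -> str:
    """Reduce a URL or hostname to a bare lowercase domain without a leading www."""
    raw = raw.strip().lower()
    if "://" not in raw:
        # urlparse only recognizes the host when a scheme or leading // is present.
        raw = "//" + raw
    host = urlparse(raw).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    return host
```

For example, normalize_domain("https://www.Stripe.com/pricing") and normalize_domain("stripe.com") both return "stripe.com", so they hit the same HubSpot record.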

Extending it. The obvious next steps are adding a tool to create associated contacts in HubSpot, connecting a Slack notification when research completes, and scheduling periodic re-research for high-value accounts. Each is just another @tool definition.

The reps who used to spend their mornings researching now spend them selling. The data quality in the CRM goes up because it's structured and consistent. And the research doesn't disappear when someone forgets to save their notes.
