Software Engineer at Timescale
Socials:
- github.com/Askir
- linkedin.com/in/jaschabeste
The LLM understands arbitrary web pages and finds the correct links to click. Not for one specific page but for ANY company name that you give it.
It will always come back with a list of technologies used, if available on the company's page, regardless of how that page is structured. That level of generic understanding is simply not solvable with just some regex and curls.
Oh, the tools are hand-coded (or rather built with Claude Code), but the agent can call them to control the browser.
Imagine a prompt like this:
You are a research agent. Your goal is to figure out this company's tech stack: - Company Name
Your available tools are:
- navigate_to_url: use this to load a page, e.g. use Google or Bing to search for the company site. It will return the page content as well as a list of available links.
- click_link: use this to click on a specific link on the currently open page. It will also return the current page content and any available links.
A good strategy is usually to go to the company's careers page and search for technical roles.
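For illustration, here is a minimal sketch of what that loop might look like with the OpenAI Node SDK's chat-completions tool calling in TypeScript. The tool schemas mirror the prompt above; navigateToUrl and clickLink are hypothetical browser helpers (a Puppeteer-based sketch of them appears further down), and the model name, turn cap, and system prompt are assumptions, not the actual implementation.

```typescript
import OpenAI from "openai";

// Hypothetical browser helpers; a Puppeteer-based sketch appears further down.
declare function navigateToUrl(url: string): Promise<unknown>;
declare function clickLink(linkIndex: number): Promise<unknown>;

const openai = new OpenAI();

// Tool schemas as exposed to the model, mirroring the prompt above.
const tools: OpenAI.Chat.Completions.ChatCompletionTool[] = [
  {
    type: "function",
    function: {
      name: "navigate_to_url",
      description:
        "Load a page, e.g. a Google or Bing search for the company site. " +
        "Returns the page text and a list of available links.",
      parameters: {
        type: "object",
        properties: { url: { type: "string" } },
        required: ["url"],
      },
    },
  },
  {
    type: "function",
    function: {
      name: "click_link",
      description:
        "Click a link (by index) on the currently open page. " +
        "Returns the new page text and its available links.",
      parameters: {
        type: "object",
        properties: { linkIndex: { type: "number" } },
        required: ["linkIndex"],
      },
    },
  },
];

export async function researchTechStack(company: string) {
  const messages: OpenAI.Chat.Completions.ChatCompletionMessageParam[] = [
    {
      role: "system",
      content:
        "You are a research agent. Your goal is to figure out this company's tech stack. " +
        "A good strategy is usually to go to the company's careers page and look at technical roles.",
    },
    { role: "user", content: `Company: ${company}` },
  ];

  for (let turn = 0; turn < 100; turn++) {
    const res = await openai.chat.completions.create({ model: "gpt-5", messages, tools });
    const msg = res.choices[0].message;
    messages.push(msg);

    if (!msg.tool_calls?.length) return msg.content; // agent decided it is done

    for (const call of msg.tool_calls) {
      const args = JSON.parse(call.function.arguments);
      const result =
        call.function.name === "navigate_to_url"
          ? await navigateToUrl(args.url)
          : await clickLink(args.linkIndex);
      messages.push({ role: "tool", tool_call_id: call.id, content: JSON.stringify(result) });
    }
  }
  return null; // ran out of turns
}
```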
This is a short form of what is actually written there, but we use this to score leads: we are built on Postgres and AWS, and if a company is using those, they are very interesting relevance signals for us.
Hmm, my browser agents each have about 50-100 turns (roughly 3-5 minutes each) and one focused objective. I make use of structured output to group all the info found into a standardized format at the end.
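That last step could look something like the sketch below, using the OpenAI SDK's zod-based structured-output helper. The TechStack fields are made up; the point is just forcing the final answer into a fixed schema.

```typescript
import OpenAI from "openai";
import { z } from "zod";
import { zodResponseFormat } from "openai/helpers/zod";

// Hypothetical result schema; the fields are assumptions.
const TechStack = z.object({
  companyName: z.string(),
  databases: z.array(z.string()),
  cloudProviders: z.array(z.string()),
  languages: z.array(z.string()),
  sources: z.array(z.string()), // URLs the agent based its answer on
});

const openai = new OpenAI();

export async function summarizeFindings(transcript: string) {
  const completion = await openai.beta.chat.completions.parse({
    model: "gpt-5",
    messages: [
      { role: "system", content: "Group everything the research agent found into the given format." },
      { role: "user", content: transcript },
    ],
    response_format: zodResponseFormat(TechStack, "tech_stack"),
  });
  return completion.choices[0].message.parsed; // typed result, or null if refused
}
```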
I have 4 of those "research agents" with different prompts running one after another, and then I format the results into a nice Slack message + summarize and evaluate the results in one final call (with just the result JSONs as input).
This works really well. We use it to score leads by how promising they are for us to reach out to.
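Roughly, the glue code for that orchestration could look like this. The four objectives and the helper names (runResearchAgent, summarizeAndEvaluate, postToSlack) are placeholders standing in for the agent loop, the structured-output call, and whatever Slack client is used, not the actual code.

```typescript
// Hypothetical wrappers around the pieces sketched above.
declare function runResearchAgent(company: string, objective: string): Promise<object>;
declare function summarizeAndEvaluate(results: object[]): Promise<string>;
declare function postToSlack(text: string): Promise<void>;

// The concrete objectives are made up; the point is four focused agents
// running one after another, then a single call over their result JSONs.
const objectives = [
  "Figure out the company's tech stack.",
  "Find basic business information (size, industry, funding).",
  "Find the engineering leaders and likely technical contacts.",
  "Look for Postgres and AWS usage signals.",
];

export async function scoreLead(company: string) {
  const results: object[] = [];
  for (const objective of objectives) {
    results.push(await runResearchAgent(company, objective));
  }
  const summary = await summarizeAndEvaluate(results);
  await postToSlack(`*${company}*\n${summary}`);
}
```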
I have built a custom "deep research" tool internally that uses Puppeteer to find business information, the tech stack, and other information about a company for our sales team.
My experience was that giving the LLM a very limited set of tools and no screenshots worked pretty damn well. Tbf for my use case I don't need more interactivity than navigate_to_url and click_link. Each tool returns a text version of the page and the clickable options as an array.
It is very capable of answering our basic questions, although it is powered by GPT-5 rather than Claude now.
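A minimal Puppeteer-based sketch of those two tools, assuming each one returns the page text plus a numbered array of links; the link extraction and navigation details here are assumptions.

```typescript
import puppeteer, { Browser, Page } from "puppeteer";

let browser: Browser | null = null;
let page: Page | null = null;
let currentLinks: { text: string; href: string }[] = [];

async function getPage(): Promise<Page> {
  if (!browser) browser = await puppeteer.launch({ headless: true });
  if (!page) page = await browser.newPage();
  return page;
}

// Text version of the page plus the clickable options as an array.
async function getPageState(p: Page) {
  const text = await p.evaluate(() => document.body.innerText);
  currentLinks = await p.evaluate(() =>
    Array.from(document.querySelectorAll("a[href]")).map((a) => ({
      text: (a.textContent ?? "").trim(),
      href: (a as HTMLAnchorElement).href,
    }))
  );
  return { text, links: currentLinks.map((l, i) => `${i}: ${l.text} -> ${l.href}`) };
}

export async function navigateToUrl(url: string) {
  const p = await getPage();
  await p.goto(url, { waitUntil: "networkidle2" });
  return getPageState(p);
}

export async function clickLink(linkIndex: number) {
  const p = await getPage();
  // Following the stored href is simpler and more robust than dispatching
  // a real click on the element.
  await p.goto(currentLinks[linkIndex].href, { waitUntil: "networkidle2" });
  return getPageState(p);
}
```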
We've deployed a few internal MCP servers (e.g. to access our Slack messages, Salesforce cases, and other internal information).
MCP allows you to both:
- Mount these into your chatbot of choice
- Use these in any automation (custom chatbot or other LLM flow) if your framework supports MCP
Tbf I am still under the impression that for the latter use case you would rather use the (HTTP) API directly, because MCP doesn't allow you to customize the tool descriptions, and in my experience, if you truly want your automation to perform well, you need to iterate on those.
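For reference, a minimal MCP server exposing one such internal tool might look like the sketch below, assuming the TypeScript MCP SDK. The tool name and the searchSlack helper are hypothetical; note that the tool description lives in the server, which is exactly the limitation mentioned above.

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// Hypothetical internal helper wrapping the Slack search API.
declare function searchSlack(query: string): Promise<string[]>;

const server = new McpServer({ name: "internal-info", version: "1.0.0" });

// The tool description is fixed here in the server; a client mounting this
// server cannot rewrite it, which is why iterating on descriptions for one
// specific automation is easier against a plain HTTP API.
server.tool(
  "search_slack_messages",
  "Search our internal Slack messages for a keyword or phrase.",
  { query: z.string() },
  async ({ query }) => {
    const hits = await searchSlack(query);
    return { content: [{ type: "text", text: hits.join("\n") }] };
  }
);

await server.connect(new StdioServerTransport());
```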
This project is an enhanced reader for Y Combinator's Hacker News: https://news.ycombinator.com/.
The interface also allows you to comment, post, and interact with the original HN platform. Credentials are stored locally and are never sent to any server; you can check the source code here: https://github.com/GabrielePicco/hacker-news-rich.
For suggestions and feature requests you can write to me here: gabrielepicco.github.io