Ask HN: How do you search the web programmatically these days?

For the first time in a long time, I need to query a search engine programmatically, and found that most of them block the use of curl, etc.

So, my question is simple: how do you solve the problem? I've tried searxng with mediocre success, but it seems a bit heavy to have to be running a complete separate service for this one thing that I only need every once in a while. I haven't tried using a service that requires an API key, simply because I'm not sure which direction to go or who to go with.

Just thought I would ask here first.

4 points | by coreyp_1 20 hours ago

5 comments

  • davidsojevic 2 hours ago
    I work at SerpApi [0], and we offer a free tier that may serve your needs if you're just looking to do programmatic searches periodically.

    Much of the reason people go with a service like ours is because of the difficulty with rolling your own reliable solution. Happy to answer any questions you might have as well!

    [0]: https://serpapi.com/

  • dserban 15 hours ago
    https://pypi.org/project/ddgs/

    (Assuming you prefer Python.)

  • pwg 20 hours ago
    > and found that most of them block the use of curl

    Try again, but have curl provide a user agent string from one of the real browsers. You'll likely find that the request goes through.

  • raw_anon_1111 11 hours ago
    Can’t speak for search engines specifically. But I recently had to do a project which required me to crawl the customer’s large site and index it into a vector search for RAG for a call center.

    My first attempt was to use crawl it just by doing GET requests (ie same thing as using curl). That got me nowhere. I had to use headless Chrome and Playwright.

    Do any modern websites work with just curl even if they don’t block it - ie without being able to run JS?

  • lucas0647 20 hours ago
    [dead]