Bots are currently scraping the internet for LLM training data at unprecedented rates[1][2][3], driving up costs and destabilizing public-facing websites. I want to talk about how this has been particularly difficult for wikis, and how it has gotten much worse in the last few months.
If you don’t mind paying for it, I like Kagi. Some people avoid it for reasons unrelated to its performance, so do your own research if those things matter to you.
For free, DDG isn’t bad, but there are also options you can self-host or use through instances other people host: SearXNG and LibreY, for example. (I’d link to some, but I don’t want to accidentally get them hugged to death, since they’re run on small independent servers.) I don’t use any of these free options, so I can’t say which one is better, but at least I’m aware they exist. I do know SearXNG also lets you pick and choose which search engines are used in your searches.