What about their LLM products? We know that OpenAI does not respect the robots.txt file.




Google uses the same crawler and robots.txt file for training data.

It's actually controlled by a separate robots.txt user-agent token for training data: Google-Extended. So you can exclude yourself from the training data, though not from the search summaries.
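
For anyone who wants to try the opt-out described above, here is a rough sketch. It assumes the token Google documents for this purpose is `Google-Extended`; the robots.txt contents, the `can_fetch` helper, and the example.com URL are made up for illustration. Python's standard-library parser is only an approximation of how Google itself matches rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of the Google-Extended token,
# leave every other user agent (including Googlebot) allowed.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def can_fetch(agent: str, url: str = "https://example.com/page") -> bool:
    """Return True if the given user-agent token may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


print(can_fetch("Google-Extended"))  # False
print(can_fetch("Googlebot"))        # True
```

This only shows what the file asks for. Whether a given crawler honors it is a separate question, which is the point being argued upthread.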


