o1 Thinking In Chinese

2025-01-15 @ 11 AM - #ai, #openai, #slashdot

While I haven’t personally encountered it in a while, ChatGPT used to occasionally autogenerate chat titles in Spanish instead of English. I assume that o1 thinking in Chinese occurs for a similar set of reasons, since both behaviors seem related to the multilingual datasets that the models are trained on. As an aside, I’m curious whether Chinese makes up a significantly higher percentage of the training data for Chinese LLMs like DeepSeek, since English and Chinese already make up a significant part of the corpora for English models.