Stockmark Corporation has publicly released a large language model (LLM) trained on the dataset created by the Language Information Access Technology Team’s “Project of Development of Japanese Instruction data for LLM.”
The “Project of Development of Japanese Instruction data for LLM” aims to produce large-scale, high-quality “instruction data,” which is crucial for LLM development. The project involves collaborative research with multiple companies. Stockmark Corporation, one of the collaborating companies, used its in-house 13-billion-parameter LLM, “Stockmark-13b,” to create and evaluate a model trained on this project’s dataset. The evaluation showed that the model trained on this project’s dataset achieves higher performance than when existing datasets are used. Consequently, Stockmark Corporation has made this model publicly available.
For more details, please refer to Stockmark Corporation’s website.