【Released】Project of Development of Japanese Instruction data for LLM

November 14, 2023 13:13

The large language model (LLM) trained using the dataset created by the Language Information Access Technology Team’s “Project of Development of Japanese Instruction data for LLM” has been publicly released by Stockmark Corporation.

The “Project of Development of Japanese Instruction data for LLM” aims to produce large-scale, high-quality “instruction data,” which is crucial for developing LLMs. The project is carried out as collaborative research with multiple companies. Stockmark Corporation, one of the collaborating companies, trained its own 13-billion-parameter LLM, “Stockmark-13b,” on this project’s dataset and evaluated the resulting model. The evaluation showed that the model trained on this project’s dataset performs better than models trained on existing datasets. Consequently, Stockmark Corporation has made this model publicly available.

For more details, please refer to Stockmark Corporation’s website.

Related Laboratories

Last updated on June 19, 2025 14:23

Laboratory

Language Information Access Technology Team (2017/5--2025/3)

Center for Advanced Intelligence Project

News