Suggestion: publishing (parts of the) training data
#5, opened by FlipTip
Hi IBM Team,
Thanks for the permissive license.
Are there any plans to release the training data, or parts of it, to help the community gain deeper insights into the model? Alternatively, sharing the synthetic data generation pipeline would also go a long way towards better understanding.
Community-driven open-source AI thrives on transparency; it accelerates collaborative research.
Thanks again for your contributions to the AI research community.
Hi @FlipTip, thanks for bringing up the topic of data transparency. A detailed description of the training data will be released in the final whitepaper when the full set of 4.0 models is launched, so stay tuned!