Is there a rule-of-thumb for how to divide a dataset into training and validation sets? [closed]

IT Nursery

May 23, 2022

Is there a rule-of-thumb for how to best divide data into training and validation sets? Is an even 50/50 split advisable? Or are there clear advantages to having more training data relative to validation data (or vice versa)? Or is this choice pretty much application-dependent?

I have mostly been using an 80% / 20% split between training and validation data, respectively, but I chose this division without any principled reason. Can someone who is more experienced in machine learning advise me?
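For concreteness, an 80% / 20% split of the kind described above could be sketched as follows. This is only an illustration and not code from the original post: the function name `train_val_split`, its `val_fraction` and `seed` parameters, and the toy data are all assumptions made for the example.

```python
import random


def train_val_split(items, val_fraction=0.2, seed=0):
    """Shuffle a copy of `items` and split it into (train, val) lists.

    Hypothetical helper for illustration; `val_fraction` is the share
    of the data held out for validation.
    """
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    shuffled = list(items)                     # copy so the input is untouched
    random.Random(seed).shuffle(shuffled)      # seeded for reproducibility
    n_val = round(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]  # (train, val)


data = list(range(100))                        # toy stand-in for a real dataset
train, val = train_val_split(data, val_fraction=0.2)
print(len(train), len(val))                    # 80 20
```

Changing `val_fraction` to 0.5 would give the even 50/50 split asked about; the sketch says nothing about which ratio is preferable.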

7 Answers

Tags: machine-learning
