Show a human any photograph and they’ll be able to predict what happens next with pretty decent accuracy. The woman riding her bike will keep on moving. The dog will catch the frisbee. The man is going to have a pratfall. And so on. It’s such a basic skill that we don’t consider the vast amount of information that is used to make these predictions (concerning gravity, inertia, the nature of pratfalls, etc.), and teaching computers to do the same is proving to be a key challenge in machine vision.
Researchers from MIT attempting to solve this problem have come up with some very impressive results, using specially trained neural networks to turn images into videos, essentially getting the computer to predict what happens next.