Facebook has announced that it is releasing three of its main image identification algorithms to the public. It’s not the first time Facebook has opened its research to the public, and it likely won’t be the last. In this particular instance, Facebook says that it hopes the work will “rapidly advance the field of machine vision”.
Such technology has already come a long way in just the last few years. It’s a bit like what you see on Google when you search by uploading an image. It makes an attempt to identify the person, place, or object in the image, and offers similar or related results. It’s also similar to the technology coming in the iOS 10 update to automatically categorise your photos.
In Facebook’s words: “We’re making the code for DeepMask+SharpMask as well as MultiPathNet — along with our research papers and demos related to them — open and accessible to all, with the hope that they’ll help rapidly advance the field of machine vision.”
Such functionality is far more widespread than many of us think. With over 300 million uploads per day, detecting what’s in an image makes life easier for the system and for us. Facebook has had facial recognition, for example, for a long time now. You upload a photo of a group of people, and it recognises a “friend” and suggests you tag them.
Running facial recognition on every landscape and cat photo uploaded to Facebook, however, would be a huge waste of resources. This is where algorithms like those now made public step in. They detect the objects in your scene, where they are, and how many of them there are. If there’s no person in the shot, the system doesn’t waste time on facial recognition. This is why faces are no longer being detected in coffee cup foam and clouds as often as they used to be.
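To make that idea concrete, here is a minimal sketch in Python of skipping the expensive face step when no person has been detected. It is purely illustrative and not Facebook’s code: the `Detection` class, the label strings, and the function names are all made up for this example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """One detected object: a label plus a bounding box (hypothetical shape)."""
    label: str
    box: Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def count_objects(detections: List[Detection]) -> Counter:
    """Count how many objects of each label were found in the scene."""
    return Counter(d.label for d in detections)


def needs_face_recognition(detections: List[Detection]) -> bool:
    """Only run the costly face step when at least one person was detected."""
    return count_objects(detections)["person"] > 0


# Hand-written detections standing in for the output of a real detector.
landscape = [Detection("tree", (0, 0, 50, 120)), Detection("cat", (60, 80, 30, 20))]
group_shot = [Detection("person", (10, 10, 40, 90)), Detection("person", (70, 12, 38, 88))]

for name, scene in [("landscape", landscape), ("group shot", group_shot)]:
    print(name, dict(count_objects(scene)), needs_face_recognition(scene))
```

The point is simply that a cheap “what is in this picture?” pass decides whether the slower face matching needs to run at all.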
The code released now is basically a two-part process. The first part identifies where objects are in the scene and how many of them there are. This is the DeepMask+SharpMask bit: DeepMask figures out the rough outlines of objects, and SharpMask refines them. In the second part, MultiPathNet attempts to identify exactly what each object is. Is it a person? A dog? A donut? A giraffe?
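For the coders among you, the flow described above can be sketched in a few lines of Python. This is only a rough outline of the three roles (propose, refine, classify) and not the released code or its API: the function names are invented, and the toy stand-ins return fixed strings where the real models would work on pixels.

```python
from typing import Any, Callable, List, Tuple


def segment_and_label(
    image: Any,
    propose: Callable[[Any], List[Any]],   # DeepMask's role: rough object outlines
    refine: Callable[[Any, Any], Any],     # SharpMask's role: sharpen each outline
    classify: Callable[[Any, Any], str],   # MultiPathNet's role: name each object
) -> List[Tuple[Any, str]]:
    """Return one (refined mask, label) pair per object found in the image."""
    results = []
    for coarse_mask in propose(image):
        mask = refine(image, coarse_mask)
        results.append((mask, classify(image, mask)))
    return results


# Toy stand-ins so the sketch runs end to end; real models would go here.
def toy_propose(image):
    return ["coarse-a", "coarse-b"]


def toy_refine(image, coarse_mask):
    return coarse_mask.replace("coarse", "sharp")


def toy_classify(image, mask):
    return {"sharp-a": "person", "sharp-b": "dog"}[mask]


objects = segment_and_label("photo.jpg", toy_propose, toy_refine, toy_classify)
print(len(objects), "objects:", [label for _, label in objects])
```

Keeping the three steps as separate functions mirrors the way the article describes them: each stage only needs the output of the one before it.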
Facebook’s next challenge is to apply the techniques to video. Objects that move and change over time, as well as interact with each other, add a whole new level of complexity.
The idea behind the technology is fascinating, and it certainly holds a lot of potential for future applications. Seeing what the general coding public makes of it will be equally interesting.
Will you be playing with the code? Even if you’re not, what ideas can you come up with where this type of image recognition would be useful? Let us know your thoughts in the comments.
[via Popular Photography]