Image Recognition

Analysis: Neural networks trained for object recognition tend to identify stuff based on its texture rather than its shape, according to the latest research.

That means take away or distort the texture of something, and the wheels fall off the software.

Artificial intelligence may suck at, for instance, reading and writing, but it can be pretty good at recognizing things in images.

The latest explosion of excitement around neural-network-based computer vision was sparked in 2012 when the ImageNet Large Scale Visual Recognition Challenge, a competition pitting various image recognition systems against each other, was won by a convolutional neural network (CNN) dubbed AlexNet.

After this, tons of new image-scrutinizing CNN architectures came flooding in, and by 2017 most of them had an accuracy of over 95 per cent in the competition. If you showed them a photo, they would be able to confidently figure out what object or creature is in the snap. Now, it’s easy for developers and companies to just use off-the-shelf models trained on the ImageNet dataset to solve whatever image recognition problem they have, whether it's figuring out which species of animals are in a picture, or identifying items of clothing in a shot.

However, CNNs are also easily fooled by adversarial inputs. Change a small block of pixels in a photograph, and the software will fail to recognize an object correctly. What was a banana now looks like a toaster to the AI just by tweaking some colors. Heck, even a turtle can be mistaken for a gun.
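To get a feel for why a small change can flip a label, here is a toy sketch. It is not the attack used in the banana or turtle studies: it assumes a hypothetical four-pixel linear classifier with made-up weights, and it nudges every pixel slightly in the direction that lowers the score (a sign-of-the-gradient step) rather than altering one block of pixels.

```python
# Toy illustration only: a hypothetical linear "banana vs toaster" classifier.
# Weights and pixel values are made up; real CNNs are far larger and non-linear.

WEIGHTS = [0.5, -0.25, 0.75, -0.5]

def score(pixels):
    """Weighted sum of pixel values; positive means 'banana'."""
    return sum(w * p for w, p in zip(WEIGHTS, pixels))

def classify(pixels):
    return "banana" if score(pixels) > 0 else "toaster"

def sign(value):
    """Return 1, -1 or 0 depending on the sign of value."""
    return (value > 0) - (value < 0)

def perturb(pixels, epsilon):
    """Shift each pixel by at most epsilon in the direction that lowers the score.

    For a linear model, the gradient of the score with respect to a pixel
    is simply that pixel's weight.
    """
    return [p - epsilon * sign(w) for w, p in zip(WEIGHTS, pixels)]

original = [0.6, 0.4, 0.5, 0.7]
adversarial = perturb(original, epsilon=0.2)
print(classify(original), "->", classify(adversarial))
```

No pixel moves by more than 0.2, yet the toy classifier changes its answer, because every nudge pushes the score the same way.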

And why is that? Could it be that machine-learning software focuses too much on texture, allowing changes in patterns in the image to hoodwink the classifier software?

Never mind the image, feel the texture

A paper submitted to this year’s International Conference on Learning Representations (ICLR) may explain why. Researchers from the University of Tübingen in Germany found that CNNs trained on ImageNet identify objects by their texture rather than shape.

They devised a series of simple tests to study how humans and machines understand visual abstracts. In the computer corner, four CNN models: AlexNet, VGG-16, GoogLeNet, and ResNet-50. In the fleshbag corner, 97 people. Everyone, living and electronic, was asked to identify the objects and animals shown in a series of images.

Crucially, the images were distorted in different ways to test each viewer's ability to truly comprehend what they were seeing: the pictures were presented as grayscale; with the object as a black silhouette against a white background; just the outline of the object; just a close-up of the texture of an object; with a distorted texture laid over the object; and just as normal.


An example of an image being distorted in different ways and the accuracy of the neural networks and humans in analyzing it. Source: Geirhos et al
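Three of these manipulations (grayscale, silhouette, and outline) can be roughly sketched on a tiny image stored as nested lists. This is an illustration, not the researchers' pipeline: the function names and the threshold are assumptions, and the silhouette step assumes the object is darker than its background.

```python
# Illustrative only: toy versions of three distortions on a tiny RGB image.
# The image is a list of rows; each pixel is an (r, g, b) tuple in 0..255.

def to_grayscale(image):
    """Convert RGB pixels to luminance using the common BT.601 weights."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in image]

def to_silhouette(gray, threshold=128):
    """Black object (0) on a white background (255).

    Assumes the object is darker than the background (hypothetical threshold).
    """
    return [[0 if value < threshold else 255 for value in row] for row in gray]

def to_outline(silhouette):
    """Keep only object pixels that touch the background or the image border."""
    height, width = len(silhouette), len(silhouette[0])
    outline = [[255] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if silhouette[y][x] != 0:
                continue  # background pixel, nothing to draw
            neighbours = [(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)]
            on_edge = any(
                not (0 <= ny < height and 0 <= nx < width)
                or silhouette[ny][nx] == 255
                for ny, nx in neighbours
            )
            if on_edge:
                outline[y][x] = 0
    return outline
```

Each step throws away more texture while keeping the shape, which is the kind of input the networks struggled with.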

The results showed that almost all the images that retained the objects' shape and texture were recognized correctly by humans and the neural networks. But when the test involved changing or removing the texture of the objects, the machines fared much worse. The software couldn't work with the shape of stuff alone.

(It's not entirely clear if the humans in the test were able to figure out what an object was from an earlier image. For example, if someone was shown a grayscale snap of a cat and then an outline of said cat, they could work out it was the cat again, whereas the neural networks do not retain state during inference. If so, this would be an advantage to humans in what was not supposed to be a memory test. However, it doesn't change the fact the AI couldn't deal with shapes alone.)


AI systems fail to correctly identify a picture of a cat if it is given the texture of an elephant. Source: Geirhos et al

“These experiments provide behavioral evidence in favor of the texture hypothesis: a cat with an elephant texture is an elephant to CNNs, and still a cat to humans,” the paper stated.

Neural networks are lazy learners

It appears humans can recognize objects by their overall shape, while machines consider smaller details, particularly textures. When asked to identify objects with an incorrect texture, such as the cat-with-elephant-skin example, the 97 human participants were accurate 95.9 per cent of the time on average, but the neural networks only scored between 17.2 per cent and 42.9 per cent.
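Scoring a test like this comes down to counting how often the answer matches the shape of the object and how often it matches the texture pasted onto it. A minimal sketch follows, using made-up predictions rather than data from the paper; the function name and the record layout are assumptions.

```python
def cue_conflict_scores(trials):
    """Score predictions on images whose shape and texture disagree.

    trials: list of (shape_label, texture_label, predicted_label) tuples.
    Returns the fraction of predictions matching the shape and the texture.
    """
    if not trials:
        raise ValueError("need at least one trial")
    total = len(trials)
    shape_hits = sum(1 for shape, _, pred in trials if pred == shape)
    texture_hits = sum(1 for _, texture, pred in trials if pred == texture)
    return {
        "shape_accuracy": shape_hits / total,
        "texture_rate": texture_hits / total,
    }

# Made-up predictions for illustration, not results from the study.
trials = [
    ("cat", "elephant", "elephant"),
    ("cat", "elephant", "cat"),
    ("car", "clock", "clock"),
    ("bear", "bottle", "bottle"),
]
scores = cue_conflict_scores(trials)
print(scores)
```

In this invented sample the classifier follows the texture three times out of four, the same pattern, in miniature, that the researchers reported for the CNNs.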

“On a very fundamental level, our work highlights how far current CNNs are from learning the 'true' structure of the world,” Robert Geirhos, coauthor of the paper and a PhD student at the university, explained to The Register.

“They learn the easiest associations possible, and in many cases this means associating small texture-like bits and pieces of an image with a class label, rather than learning how objects [are typically shaped]. And I think adversarial examples are clearly pointing to the same problem – current CNNs don't learn the 'true' structure of the world.”

The problem may lie in the dataset. ImageNet contains over 14 million images of objects split across many categories, and yet it's not enough – there are not enough angles and other insights, it seems. Software trained from this information can't understand how stuff is actually formed, shaped, and proportioned.

The algorithms can tell butterfly species from the patterns on the creatures' wings, but take away that detail, and the code seemingly has no idea what it's actually looking at. It's fake smart.

“These datasets may just be too simple: if they can be solved by detecting textures, why bother checking whether the shape matches, too?" said Geirhos.

"For humans, it is hard to imagine recognizing a car by detecting a specific tire pattern that only images from the 'car' category have, but for CNNs this might just be the easiest solution since the shape of an object is much bigger, and changes a lot depending on viewpoint, etc. Ultimately, we may need better datasets that don't allow for this kind of ‘cheating’."

Time for a tech fix

Back to the adversarial question: do these findings of an over-reliance on texture confirm why slightly altered colors and patterns in pictures fool neural networks? That corrupting a section of banana peel makes the code think it's looking at the texture of a shiny metal toaster?

To investigate this, the researchers built Stylized-ImageNet, a new dataset based on ImageNet. They scrubbed the original textures in the images and swapped them with a random texture, and then retrained a ResNet-50 model. Interestingly, although the CNN was more robust to the changes, it still fell victim to adversarial examples. So, no. The answer to our question is no.

“Even a model trained on Stylized-ImageNet is still susceptible to adversarial examples, so unfortunately a shape bias is not a solution to adversarial examples," Geirhos explained.

"However, current state-of-the-art CNNs are very susceptible to random noise such as rain or snow in the real world, [which is] a problem for autonomous driving. The fact that the shape-based CNN that I trained turned out to be much more robust on nearly all tested sorts of noise seems like a promising result on the way to more robust models.”

The texture versus shape problem may not sound like such a big deal, but it could have far-reaching consequences. Some systems pretrained on ImageNet might not perform so well in other domains, like facial recognition or medical imaging.

Read Source Article: The Register

In Collaboration with HuntertechGlobal

China’s top search engine company Baidu made a smart cat shelter in Beijing that uses AI to verify when a cat is approaching and open its door. The cat shelter is heated and also offers cats food and water.

Besides running China’s main search engine, Baidu also works on AI tools in general and owns iQiyi, a Netflix-like rival that uses algorithms to determine what viewers may be interested in watching next. While cat shelters ordinarily seem out of the scope of what Baidu does, the company says that the idea first came to one employee, Wan Xi, who uncovered a small cat hiding in his car last winter and began to sympathize with the plight of other stray cats. Wan then apparently shut himself at home to develop software and work on a possible solution, using tools from Baidu’s AI team. Then, consulting with volunteer groups, Baidu created the actual physical shelters as a team effort.

Baidu is based in Beijing, where temperatures can drop to 15 degrees Fahrenheit (-9 degrees Celsius) in the winter, leaving stray cats in pretty dire conditions. Baidu wrote in a blog post that only 40 percent of stray cats survive the winter on average. While the backstory and the technology itself feel a bit gimmicky, this does appear to be a genuinely good application of artificial intelligence to benefit stray animals.

While scanning a cat’s face at the door, the cameras are also apparently capable of checking the cat for diseases and of telling whether it has been neutered by trying to spot an ear tag. If a sick or non-neutered cat is discovered, the system will ping a nearby volunteer group to provide aid to the cat. Baidu also mentions in its blog post that many stray cats tend not to be neutered, meaning that they can just continue to mate and spawn more cats, worsening the living conditions of the cats overall.

After the cat enters the shelter, the door will shut behind it to prevent any other critters or stray dogs from entering. (The developers seem a little biased against stray dogs.) The cats themselves can venture onward to a living room of sorts.

The AI system is apparently capable of recognizing 174 different kinds of cats. The cameras also are equipped with night vision so that if any cats wander around at night, they can still enter or exit the shelters. The system can recognize four common kinds of cat disease, including stomatitis, skin disease, and external injuries.

AI is being used on animals more and more. There are examples of it being used in projects aimed at wildlife preservation and even in reuniting owners with lost pets. Most of these efforts are trials and experiments with the nascent technology.

One of the challenges of capturing the faces of animals with AI is getting them to point their faces toward the camera. In Baidu’s case, however, it seems that the doors to the cat-sized shelters are small enough that the camera perched on top should be able to get a good view of the cat’s face.

Read Source Article: The Verge


Deep-learning algorithm helps to diagnose conditions that aren’t readily apparent to doctors or researchers.

In a paper1 published on 7 January in Nature Medicine, researchers describe the technology behind the diagnostic aid, a smartphone app called Face2Gene. It relies on machine-learning algorithms and brain-like neural networks to classify distinctive facial features in photos of people with congenital and neurodevelopmental disorders. Using the patterns that it infers from the pictures, the model homes in on possible diagnoses and provides a list of likely options.

Doctors have been using the technology as an aid, even though it's not intended to provide definitive diagnoses. But it does raise a number of ethical and legal concerns, say researchers. These include ethnic bias in training data sets and the commercial fragmentation of databases, both of which could limit the reach of the diagnostic tool.

Researchers at FDNA, a digital-health company in Boston, Massachusetts, first trained the artificial intelligence (AI) system to distinguish Cornelia de Lange syndrome and Angelman syndrome — two conditions with distinct facial features — from other similar conditions. They also taught the model to classify different genetic forms of a third disorder known as Noonan syndrome.

Then the researchers, led by FDNA chief technology officer Yaron Gurovich, fed the algorithm more than 17,000 images of diagnosed cases spanning 216 distinct syndromes. When presented with new images of people’s faces, the app’s best diagnostic guess was correct in about 65% of cases. And when considering multiple predictions, Face2Gene's top-ten list contained the right diagnosis about 90% of the time.
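The gap between the "best guess" figure and the "top-ten" figure is the difference between top-1 and top-k accuracy. A minimal sketch of how such numbers are computed is below; the rankings are invented for illustration and the function name is an assumption, not anything taken from Face2Gene.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of cases whose true label appears among the first k guesses.

    ranked_predictions: one list per case, most likely diagnosis first.
    true_labels: the confirmed diagnosis for each case, in the same order.
    """
    if len(ranked_predictions) != len(true_labels) or not true_labels:
        raise ValueError("need matching, non-empty inputs")
    hits = sum(
        1 for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)

# Invented rankings for four hypothetical cases.
ranked = [
    ["Angelman", "Noonan", "Cornelia de Lange"],
    ["Noonan", "Angelman", "Cornelia de Lange"],
    ["Cornelia de Lange", "Noonan", "Angelman"],
    ["Noonan", "Cornelia de Lange", "Angelman"],
]
truth = ["Angelman", "Cornelia de Lange", "Noonan", "Noonan"]

for k in (1, 2, 3):
    print(k, top_k_accuracy(ranked, truth, k))
```

Accuracy can only rise as k grows, which is why a ranked list of likely options is more forgiving than a single answer.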

Narrowing the field

Eventually, FDNA wants to develop this technology to help other companies filter, prioritize and interpret genetic variants of unknown significance during DNA analysis. But to train its models, FDNA needs data.

So the Face2Gene app is currently available for free to healthcare professionals, many of whom use the system as a kind of second opinion for diagnosing rarely seen genetic disorders, says study co-author Karen Gripp, a medical geneticist at the Nemours/Alfred I. duPont Hospital for Children in Wilmington, Delaware. It can also provide a starting point in cases in which a doctor doesn’t know what to make of a patient’s symptoms. “It’s like a Google search,” Gripp says.

Gripp, who is also FDNA’s chief medical officer, used the algorithm to help diagnose Wiedemann–Steiner syndrome in a young girl she treated last August. Although a little short for her age, the four-year-old didn’t have many of the syndrome’s distinguishing physical features, other than the fact she had lost most of her baby teeth and several adult teeth were already coming in.

Gripp had read case reports describing premature dental growth in children with Wiedemann–Steiner syndrome, an exceedingly rare disorder caused by mutations in a gene called KMT2A. To shore up confidence in the diagnosis, Gripp uploaded a photo of her young patient to Face2Gene. Wiedemann–Steiner syndrome appeared among the software’s top hits.

Gripp subsequently confirmed the girl’s diagnosis with a targeted DNA test. But she says that the AI approach helped her to narrow down the possibilities and saved the cost of more expensive multi-gene panel testing.

‘Killing it’

The program’s accuracy has improved slightly as more healthcare professionals upload patient photos to the app, says Gurovich. There are now some 150,000 images in its database.

And in an unofficial comparison conducted between Face2Gene and clinicians last August at a workshop on birth defects, the program outperformed the people. Charles Schwartz, a geneticist at the Greenwood Genetic Center in Greenwood, South Carolina, distributed facial pictures of ten children with “fairly recognizable” syndromes and asked attendees to come up with the correct diagnoses.

In only two instances did more than 50% of the 49 participating clinical geneticists pick the right syndrome. Face2Gene made the right call for seven of the pictures.

“We failed miserably, and Face2Gene killed it,” says Paul Kruszka, a clinical geneticist at the US National Human Genome Research Institute in Bethesda, Maryland. Soon, he says, “I think every paediatrician and geneticist will have an app like this and will use it just like their stethoscope”.

Silos and bias

But the algorithm is only as good as its training data set — and there’s a risk, especially where rare disorders that affect only small numbers of people worldwide are concerned, that companies and researchers will begin to silo and commodify their data sets. “That threatens the main potential good of this technology,” says Christoffer Nellåker, a computational biologist at the University of Oxford, UK, who has spearheaded efforts to facilitate data-sharing in this field.

And ethnic bias in training data sets that contain mostly Caucasian faces remains a concern. A 2017 study2 of children with an intellectual disability found that whereas Face2Gene’s recognition rate for Down syndrome was 80% among white Belgian children, it was just 37% for black Congolese children. With a more-diverse training data set, however, the algorithm’s accuracy for African faces improved, showing that more-equitable representation of diverse populations is achievable.
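A gap like this only shows up when recognition rates are broken down by group instead of being averaged over everyone. The sketch below shows one way to do that breakdown; the records and group names are made up and the function name is an assumption.

```python
from collections import defaultdict

def recognition_rate_by_group(records):
    """Compute the recognition rate separately for each group.

    records: iterable of (group, was_recognized) pairs.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [recognized, total]
    for group, recognized in records:
        counts[group][0] += int(recognized)
        counts[group][1] += 1
    return {group: hit / total for group, (hit, total) in counts.items()}

# Made-up records for illustration, not data from the 2017 study.
records = [
    ("group_a", True), ("group_a", True), ("group_a", True),
    ("group_a", True), ("group_a", False),
    ("group_b", True), ("group_b", False),
    ("group_b", False), ("group_b", False),
]
rates = recognition_rate_by_group(records)
print(rates)
```

Pooled together, these invented records give a rate of five in nine, which hides how differently the two groups fare.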

“We know this problem needs to be addressed,” says Gurovich, “and as we move forward we’re able to have less and less bias.”


Unlike humans and animals, machines have a very hard time recognizing particular objects in images. With recent developments in machine learning and deep learning, researchers have made real progress on this problem, providing machines with image recognition ability.

In simple terms, image recognition is the machine’s ability to see and recognize certain objects, people, actions etc. It is achieved by combining machine vision technology with artificial intelligence.

How does it work?

The technology works through a fairly complex procedure. In basic terms, the computer system is trained on a large number of images so that it can detect an object from different angles. It learns to analyse features such as the corners and edges of the object, combines them into a rough internal representation, and compares new images against what it has learned with the help of deep learning.

The more images the system is trained on, the more accurate it tends to be. The most accurate image recognition is typically performed by convolutional neural networks, models loosely inspired by the neurons in the human brain.
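The edge analysis described above can be illustrated with the basic operation inside a convolutional network: sliding a small grid of weights over the image. The sketch below is a simplification in plain Python. It uses a fixed, hand-written Sobel kernel, whereas a trained network learns its kernels from data, and the tiny image is made up.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and sum the element-wise products.

    Only positions where the kernel fits entirely are computed ("valid" mode).
    As in most deep-learning libraries, the kernel is not flipped.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[y + i][x + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]

# Hand-written kernel that responds to dark-to-bright vertical edges.
SOBEL_X = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]

# Made-up grayscale image: dark on the left, bright on the right.
image = [
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
]
response = convolve2d(image, SOBEL_X)
print(response)
```

The response is zero over the flat regions and large where the brightness changes, so the output marks where the edge is.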

Uses

Image recognition is used extensively in several major areas, listed below.

  1. E-commerce: The e-commerce industry is the largest to have taken steps to adopt this technology. Companies aim to turn smartphones into virtual showrooms and give shoppers a more interactive online experience by making everything they see searchable.
  2. Automobiles: Image recognition is an important part of self-driving cars, which are expected to spot an obstacle and respond by braking, warning the driver etc. Reading road signs and slowing down near zebra crossings are examples of the technology at work in this industry.
  3. Social media: Social media sites use image recognition extensively for tasks such as tagging people, recognizing the location of a photo uploaded to Facebook, distinguishing between food, people and locations, and presenting the image to the relevant communities or pages.
  4. Security: Facial and image recognition can substitute for the login ID and password that we enter when signing in to apps and websites such as Facebook, YouTube and Snapchat.

Other Applications

Current and future applications of this technology include interactive media, smart photo libraries, image viewing for the visually impaired, targeted advertising, image search, augmented reality, solving Sudoku puzzles etc.

© copyright 2017 All Rights Reserved.

A Product of HunterTech Ventures