Machine Learning

Machine learning (ML) has a rapidly increasing presence across industries. In 2017, top technology companies such as Amazon, Google, and Microsoft talked at length about ML's impact on powering their applications and services. Its usefulness continues to emerge in businesses of all sizes: marketing agencies use it to automatically target market segments, retailers to power product recommendations and personalization, and banks to run customer service chatbots and prevent fraud.

Certainly ML is a hot topic, but there's another related trend that's gaining speed: automated machine learning (autoML).

What is Automated Machine Learning?

The field of autoML is evolving so quickly there's no universally agreed-upon definition. Fundamentally, autoML offers ML experts tools to automate repetitive tasks by applying ML to ML itself.

A recent Google Research article explains that "the goal of automating machine learning is to develop techniques for computers to solve new machine-learning problems automatically, without the need for human-machine learning experts to intervene on every new problem. If we're ever going to have truly intelligent systems, this is a fundamental capability that we will need."

What's Driving Interest

AI and machine learning require expert data scientists, engineers, and researchers, and these experts are in short supply worldwide right now. By automating some of ML's repetitive tasks, autoML compensates for this shortage while boosting the productivity of the data scientists an organization already has.

By automating repetitive ML tasks, such as choosing data sources, preparing data, and selecting features, marketing and business analysts free up time for essential work. Data scientists, meanwhile, build more models in less time, improve model quality and accuracy, and fine-tune more new algorithms.
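The kind of automation described above can be illustrated with a toy sketch: a loop that fits several candidate models, scores each on held-out data, and keeps the winner without a human in the loop. The data, model family, and candidate settings below are all invented for illustration; real autoML systems search far larger spaces of algorithms and features.

```python
import random

# Toy illustration of automated model selection (not any specific
# autoML product). Each "model" is a 1-D ridge regression y = w * x
# fitted with a different regularization penalty lam.

random.seed(0)
train = [(x, 2.0 * x + random.gauss(0, 0.1)) for x in range(20)]
valid = [(x, 2.0 * x + random.gauss(0, 0.1)) for x in range(20, 30)]

def fit_ridge(data, lam):
    # Closed-form 1-D ridge regression: w = sum(x*y) / (sum(x^2) + lam)
    sxy = sum(x * y for x, y in data)
    sxx = sum(x * x for x, _ in data)
    return sxy / (sxx + lam)

def mse(data, w):
    return sum((y - w * x) ** 2 for x, y in data) / len(data)

# The "automated" part: sweep candidate settings, score each on
# held-out data, and keep the one with the lowest validation error.
candidates = [0.0, 0.1, 1.0, 10.0, 100.0]
best_lam, best_w = min(
    ((lam, fit_ridge(train, lam)) for lam in candidates),
    key=lambda lw: mse(valid, lw[1]),
)
print(f"selected lambda={best_lam}, w={best_w:.3f}")
```

The same select-by-validation-score loop generalizes from one hyperparameter to whole pipelines of algorithms and feature transformations, which is essentially what the autoML tools discussed here automate.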

AutoML Tools for Citizen Data Scientists

More than 40 percent of data science tasks will be automated by 2020, according to Gartner. This automation will result in the increased productivity of professional data scientists and broader use of data and analytics by citizen data scientists. AutoML tools for this user group usually offer a simple point-and-click interface for loading data and building ML models. Most autoML tools focus on model building rather than automating an entire, specific business function such as customer analytics or marketing analytics.

Most autoML tools (and even most ML platforms) don't address the problem of data selection, data unification, feature engineering, and continuous data preparation. Keeping up with massive volumes of streaming data and identifying non-obvious patterns is a challenge for citizen data scientists. They are often not equipped to analyze real-time streaming data, and if data is not analyzed promptly, it can lead to flawed analytics and poor business decisions.

AutoML for Model Building Automation

Some companies are using autoML to automate internal processes, particularly the building of ML models; Facebook and Google are two examples.

Facebook trains and tests a staggering number of ML models (about 300,000) every month. The company essentially built an ML assembly line to deal with so many models. Facebook has even created its own autoML engineer (named Asimo) that automatically generates improved versions of existing models.

Google is developing autoML techniques for automating the design of machine learning models and the process of discovering optimization methods. The company is currently developing a process for machine-generated architectures.

AutoML for the Automation of End-to-End Business Processes

Once a business problem is defined and the ML models are built, it's possible in some cases to automate entire business processes. This requires appropriate feature engineering and pre-processing of the data. Examples of companies actively applying autoML to the end-to-end automation of specific business processes include DataRobot, ZestFinance, and Zylotech.

DataRobot is designed for the end-to-end automation of predictive analytics. The platform automates the entire modeling lifecycle, which includes, but is not limited to, data ingestion, algorithm selection, and transformations. The platform is customizable so that it can be tailored for specific deployments, such as building a large variety of models or making high-volume predictions. DataRobot helps data scientists and citizen data scientists quickly build models and apply algorithms for predictive analytics.

ZestFinance is designed for the end-to-end automation of specific underwriting tasks. The platform automates data assimilation, model training and deployment, and explanations for compliance. ZestFinance employs machine learning to analyze traditional and nontraditional credit data to score potential borrowers who may have thin or no files. AutoML is also used to provide tools for lenders to train and deploy ML models for specific use cases such as fraud prevention and marketing. ZestFinance helps creditors and financial analysts make better lending decisions and risk assessments.

Zylotech is designed for the end-to-end automation of customer analytics. The platform features an embedded analytics engine (EAE) with a variety of automated ML models, automating the entire ML process for customer analytics, including data prep, unification, feature engineering, model selection, and discovery of non-obvious patterns. Zylotech helps data scientists and citizen data scientists leverage complete data in near real time, enabling one-to-one customer interactions.

AutoML Helps Businesses Use Machine Learning Successfully

You've probably heard the phrase "data is the new oil." It turns out data is now far more valuable than oil. However, just as crude oil needs to be "cracked" before it is turned into useful molecules, customer data must be refined before insights can be drawn from it with embedded models. Data is not instantly valuable; it must be collected, cleansed, enriched, and made analysis ready.

The autoML approach can help all businesses use machine learning successfully. Potential business insights are hidden in places where only machine learning can reach at scale. Whether you're in marketing, retail, or any other industry, AutoML is the methodology you need to extract and leverage that valuable resource.

Read Source Article: TDWI

In Collaboration with HuntertechGlobal

#AI #MachineLearning #DeepLearning #Research #ArtificialIntelligence #Analytics #DataScience #Technology #Marketing #BigData #AIHealthcare #IoT

Researchers from India's Thapar Institute of Engineering and Technology have developed a machine learning-based solution that enables the real-time inspection of solar panels.

Research scholar Parveen Bhola and associate professor Saurabh Bhardwaj used past meteorological data to compute performance ratios and degradation rates in solar panels, leading to the development of a new application for clustering-based computation which increases the ability to speed-up inspection processes and prevent further damage.
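The article does not reproduce the researchers' algorithm, but the clustering idea can be sketched in a few lines: group panels by their performance ratio and flag the low-performing cluster for inspection. The ratios and the simple 1-D k-means below are hypothetical illustrations, not the actual method from the study.

```python
# Hypothetical sketch of clustering-based panel screening: panels with
# similar performance ratios (actual output / expected output) group
# together, and the low-ratio cluster is flagged for inspection
# without a physical site visit.

def kmeans_1d(values, k=2, iters=20):
    # Simple 1-D k-means: initialize centers at the extremes, then
    # alternate assignment and mean-update steps.
    centers = [min(values), max(values)][:k]
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for v in values:
            i = min(range(len(centers)), key=lambda j: abs(v - centers[j]))
            clusters[i].append(v)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Invented performance ratios for eight panels.
ratios = [0.95, 0.93, 0.96, 0.94, 0.71, 0.68, 0.92, 0.65]
centers, clusters = kmeans_1d(ratios)
degraded = min(clusters, key=lambda c: sum(c) / len(c))
print("flag for inspection:", sorted(degraded))
```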

According to Bhola and Bhardwaj, their proposed method could improve the accuracy of current solar power forecasting models, while real-time estimation and inspection will enable real-time rapid response for maintenance.

"The majority of the techniques available calculate the degradation of photovoltaic (PV) systems by physical inspection onsite," remarked Bhola. "This process is time-consuming, costly and cannot be used for the real-time analysis of degradation. The proposed model estimates the degradation in terms of performance ratio in real time."

The researchers developed a model that estimates solar radiation through a combination of the Hidden Markov Model, used to model randomly changing systems with unobserved or hidden states, and the Generalized Fuzzy Model, which attempts to use imprecise information in its modeling process.

Both models can be used to adapt PV system inspection methods through the use of recognition, classification, clustering and information retrieval.
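As a rough illustration of the Hidden Markov Model component mentioned above, the forward algorithm below computes the probability of an observed sequence given hidden states that are never observed directly. The states, probabilities, and observations are invented for the sketch (e.g. a hidden sky condition behind observed panel output) and do not come from the study.

```python
# Minimal HMM forward algorithm with invented numbers: hidden sky
# states produce observable "high" or "low" panel output.

states = ["clear", "cloudy"]
start = {"clear": 0.6, "cloudy": 0.4}
trans = {"clear": {"clear": 0.7, "cloudy": 0.3},
         "cloudy": {"clear": 0.4, "cloudy": 0.6}}
emit = {"clear": {"high": 0.8, "low": 0.2},
        "cloudy": {"high": 0.3, "low": 0.7}}

def forward(obs):
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = {s: start[s] * emit[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {s: emit[s][o] * sum(alpha[p] * trans[p][s] for p in states)
                 for s in states}
    return sum(alpha.values())  # total probability of the sequence

print(forward(["high", "high", "low"]))
```

The same recursion underlies recognition and classification with HMMs: sequences that are improbable under the model of a healthy system are candidates for degradation.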

"As a result of real-time estimation, the preventative action can be taken instantly if the output is not per the expected value," Bhola added. "This information is helpful to fine-tune the solar power forecasting models. So, the output power can be forecasted with increased accuracy."

Read Source Article: Innovation Enterprise



Scammers may be tricking vast numbers of unsuspecting customers into giving up their personal details so that fraudulent transactions can take place – but these crafty thieves may have met their match in machine learning.

Vishers, phishers and smishers belong to a category of criminals called social engineering fraudsters, meaning they trick their victim into either disclosing confidential financial details or transferring money to a criminal.

In South Africa, data released by the SA Banking Risk Information Centre (Sabric) earlier this year revealed that more than half (55%) of the gross losses due to crime reported were from incidents that had occurred online.

Phishers, smishers, vishers – what next?


Phishers typically try to get personal details via email, smishers try their luck by SMS, and vishers are best known for their telephonic skills.

Dr Scott Zoldi, chief analytics officer at analytic software firm FICO, says vishing is an especially great risk around tax season.

"Phone call social engineering fraud – known as vishing – has gained in popularity of late, and relies on the fraudster’s powers of persuasion in conversation with their victim," he says.

"This type of SEF spikes around tax season when fraudsters claim to be the South African Revenue Service (SARS), and use spoofing to make the calls appear as if they originate from official phone numbers."

Victims may be told they will go to jail if they don't make a payment, or that a refund is due – but their bank details are needed in order to get it.

And, says Zoldi, as security settings advance and real-time payment schemes such as online banking transfers become easier to use, scammers increasingly favour tricking their victims into depositing the money themselves (authorised push payment scams) rather than stealing it through compromised account authentication (unauthorised transactions).

This means the key to beating tricksters is not through tighter security – but through targeting behaviour.

No match for machines

But Zoldi says these crafty tricksters have met their match – and it's machine learning.

Sometimes, he says, "computer says no" is the best answer.

Advances in machine learning mean it is becoming easier to stay one step ahead of social engineering fraudsters, he says.

"The good news is that machine learning models can counteract SEF techniques," he says.

These machine learning models are designed to detect the broad spectrum of fraud types attacking financial institutions, building and updating behavioural profiles online and in real time.

They monitor payment characteristics such as transaction amounts and how quickly payments are being made. This means they can – by recognising patterns – detect both generic fraud characteristics, and patterns that only appear in certain types of fraud, such as social engineering fraud.

"In SEF scenarios, the above-mentioned behaviours will appear out of line with normal transactional activity and generate higher fraud risk scores," says Zoldi.

The machine learning model can also keep track of the way various common transactions intersect either for the customer or within the individual account, for example by tracking a list of beneficiaries the customer pays regularly, the devices previously used to make payments, typical amounts, locations, times and so forth.
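A heavily simplified sketch of that profiling idea (not FICO's actual model, and with all names and thresholds invented) might score a payment against the customer's own history of beneficiaries, devices, and amounts:

```python
# Hypothetical behavioural-profile check: a payment that matches the
# customer's established patterns scores low; an out-of-character
# payment (new payee, new device, unusually large amount) scores high.

profile = {
    "beneficiaries": {"landlord", "utility-co", "gym"},
    "devices": {"phone-a1"},
    "amounts": [120.0, 45.0, 60.0, 130.0, 55.0],
}

def risk_score(payment, profile):
    score = 0.0
    if payment["beneficiary"] not in profile["beneficiaries"]:
        score += 0.4  # never-seen payee
    if payment["device"] not in profile["devices"]:
        score += 0.3  # never-seen device
    mean = sum(profile["amounts"]) / len(profile["amounts"])
    if payment["amount"] > 3 * mean:
        score += 0.3  # far above this customer's typical spend
    return score

usual = {"beneficiary": "landlord", "device": "phone-a1", "amount": 125.0}
push_scam = {"beneficiary": "unknown-acct", "device": "web-x9", "amount": 900.0}
print(risk_score(usual, profile), risk_score(push_scam, profile))
```

Production systems learn these profiles and weights online from transaction streams rather than hand-coding them, but the principle is the same: risk is measured relative to each customer's own behaviour.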

Digging deeper

"FICO’s research has shown that transactions made out of character are more than 40 times riskier than those that follow at least one established behaviour," says Zoldi.

Machine learning models can also track these risky non-monetary events, such as a change of email, address or phone number, which can often precede fraudulent monetary transactions.

Authorised push payments are a bit more difficult, he explains, because customers can be so panicked by the social engineering fraudster that when the bank intervenes, the customer distrusts, ignores, or resists the bank’s efforts to protect their accounts.

But, he says, even then, typical anticipated behaviours can be used, based on extensive profiling of the true customer’s past actions.

"We are incorporating collaborative profile technology to bring additional cross-customer understanding of the new behaviours of similar banking customers. These methods can be used to home in on individuals that are often targeted for authorised push payments and trigger the bank’s intervention," he explains.

"Fraudsters have always targeted the weakest link in the banking process. As systems become more and more secure, the weakest link, increasingly, are customers themselves.

"However, by analysing the way each customer normally uses their account, banks can detect transactions that are out of character and stop them before any money disappears, which will make social engineering scams less profitable."

Customer profiling will also help prevent fraud in real time, he says.


Read Source Article: Fin24


Neural networks are famously incomprehensible — a computer can come up with a good answer, but not be able to explain what led to the conclusion. Been Kim is developing a “translator for humans” so that we can understand when artificial intelligence breaks down.


If a doctor told that you needed surgery, you would want to know why — and you’d expect the explanation to make sense to you, even if you’d never gone to medical school. Been Kim, a research scientist at Google Brain, believes that we should expect nothing less from artificial intelligence. As a specialist in “interpretable” machine learning, she wants to build AI software that can explain itself to anyone.

Since its ascendance roughly a decade ago, the neural-network technology behind artificial intelligence has transformed everything from email to drug discovery with its increasingly powerful ability to learn from and identify patterns in data. But that power has come with an uncanny caveat: The very complexity that lets modern deep-learning networks successfully teach themselves how to drive cars and spot insurance fraud also makes their inner workings nearly impossible to make sense of, even by AI experts. If a neural network is trained to identify patients at risk for conditions like liver cancer and schizophrenia — as a system called “Deep Patient” was in 2015, at Mount Sinai Hospital in New York — there’s no way to discern exactly which features in the data the network is paying attention to. That “knowledge” is smeared across many layers of artificial neurons, each with hundreds or thousands of connections.

As ever more industries attempt to automate or enhance their decision-making with AI, this so-called black box problem seems less like a technological quirk than a fundamental flaw. DARPA’s “XAI” project (for “explainable AI”) is actively researching the problem, and interpretability has moved from the fringes of machine-learning research to its center. “AI is in this critical moment where humankind is trying to decide whether this technology is good for us or not,” Kim says. “If we don’t solve this problem of interpretability, I don’t think we’re going to move forward with this technology. We might just drop it.”

Kim and her colleagues at Google Brain recently developed a system called “Testing with Concept Activation Vectors” (TCAV), which she describes as a “translator for humans” that allows a user to ask a black box AI how much a specific, high-level concept has played into its reasoning. For example, if a machine-learning system has been trained to identify zebras in images, a person could use TCAV to determine how much weight the system gives to the concept of “stripes” when making a decision.

TCAV was originally tested on machine-learning models trained to recognize images, but it also works with models trained on text and certain kinds of data visualizations, like EEG waveforms. “It’s generic and simple — you can plug it into many different models,” Kim says.

Quanta Magazine spoke with Kim about what interpretability means, who it’s for, and why it matters. An edited and condensed version of the interview follows.

You’ve focused your career on “interpretability” for machine learning. But what does that term mean, exactly?

There are two branches of interpretability. One branch is interpretability for science: If you consider a neural network as an object of study, then you can conduct scientific experiments to really understand the gory details about the model, how it reacts, and that sort of thing.

The second branch of interpretability, which I’ve been mostly focused on, is interpretability for responsible AI. You don’t have to understand every single thing about the model. But as long as you can understand just enough to safely use the tool, then that’s our goal.

But how can you have confidence in a system that you don’t fully understand the workings of?

I’ll give you an analogy. Let’s say I have a tree in my backyard that I want to cut down. I might have a chain saw to do the job. Now, I don’t fully understand how the chain saw works. But the manual says, “These are the things you need to be careful of, so as to not cut your finger.” So, given this manual, I’d much rather use the chainsaw than a handsaw, which is easier to understand, but would make me spend five hours cutting down the tree.

You understand what “cutting” is, even if you don’t exactly know everything about how the mechanism accomplishes that.

Yes. The goal of the second branch of interpretability is: Can we understand a tool enough so that we can safely use it? And we can create that understanding by confirming that useful human knowledge is reflected in the tool.

How does “reflecting human knowledge” make something like a black box AI more understandable?

Here’s another example. If a doctor is using a machine-learning model to make a cancer diagnosis, the doctor will want to know that the model isn’t picking up on some random correlation in the data that we don’t want to pick up. One way to make sure of that is to confirm that the machine-learning model is doing something that the doctor would have done. In other words, to show that the doctor’s own diagnostic knowledge is reflected in the model.

So if doctors were looking at a cell specimen to diagnose cancer, they might look for something called “fused glands” in the specimen. They might also consider the age of the patient, as well as whether the patient has had chemotherapy in the past. These are factors or concepts that the doctors trying to diagnose cancer would care about. If we can show that the machine-learning model is also paying attention to these factors, the model is more understandable, because it reflects the human knowledge of the doctors.


Video: Google Brain’s Been Kim is building ways to let us interrogate the decisions made by machine learning systems.


Is this what TCAV does — reveal which high-level concepts a machine-learning model is using to make its decisions?

Yes. Prior to this, interpretability methods only explained what neural networks were doing in terms of “input features.” What do I mean by that? If you have an image, every single pixel is an input feature. In fact, Yann LeCun [an early pioneer in deep learning and currently the director of AI research at Facebook] has said that he believes these models are already superinterpretable because you can look at every single node in the neural network and see numerical values for each of these input features. That’s fine for computers, but humans don’t think that way. I don’t tell you, “Oh, look at pixels 100 to 200, the RGB values are 0.2 and 0.3.” I say, “There’s a picture of a dog with really puffy hair.” That’s how humans communicate — with concepts.

How does TCAV perform this translation between input features and concepts?

Let’s return to the example of a doctor using a machine-learning model that has already been trained to classify images of cell specimens as potentially cancerous. You, as the doctor, may want to know how much the concept of “fused glands” mattered to the model in making positive predictions of cancer. First you collect some images — say, 20 — that have examples of fused glands. Now you plug those labeled examples into the model.

Then what TCAV does internally is called “sensitivity testing.” When we add in these labeled pictures of fused glands, how much does the probability of a positive prediction for cancer increase? You can output that as a number between zero and one. And that’s it. That’s your TCAV score. If the probability increased, it was an important concept to the model. If it didn’t, it’s not an important concept.
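A stripped-down version of that scoring idea can be written in a few lines. In the published method the concept direction is the normal of a linear classifier trained to separate concept activations from random activations, and the score counts how often the directional derivative of the prediction along that direction is positive; the toy model, activations, and numbers below are invented for illustration.

```python
# Toy TCAV-style sketch: a concept direction in a model's internal
# activation space, and a score = fraction of inputs whose prediction
# increases when the activations are nudged along that direction.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Toy "model": maps internal activations to a prediction score.
weights = [0.9, 0.1, -0.2]
def predict(act):
    return dot(weights, act)

def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

# Concept activation vector (CAV): here just the difference of the
# class means of "concept" vs. random activations, standing in for
# the linear classifier used in the real method.
concept_acts = [[1.0, 0.2, 0.1], [0.9, 0.3, 0.0], [1.1, 0.1, 0.2]]
random_acts = [[0.1, 0.2, 0.3], [0.0, 0.4, 0.1], [0.2, 0.3, 0.2]]
cav = [c - r for c, r in zip(mean(concept_acts), mean(random_acts))]

def tcav_score(examples, eps=1e-3):
    # Finite-difference directional derivative: does moving the
    # activations slightly toward the concept raise the prediction?
    up = sum(1 for act in examples
             if predict([a + eps * c for a, c in zip(act, cav)]) > predict(act))
    return up / len(examples)

inputs = [[0.5, 0.5, 0.5], [0.8, 0.2, 0.4], [0.3, 0.6, 0.1]]
print("TCAV score:", tcav_score(inputs))
```

A score near one means the concept consistently pushes predictions up, near zero means it consistently pushes them down or is irrelevant; the statistical test Kim describes later compares this score against scores for random directions.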

“Concept” is a fuzzy term. Are there any that won’t work with TCAV?

If you can’t express your concept using some subset of your [dataset’s] medium, then it won’t work. If your machine-learning model is trained on images, then the concept has to be visually expressible. Let’s say I want to visually express the concept of “love.” That’s really hard.

We also carefully validate the concept. We have a statistical testing procedure that rejects the concept vector if it has the same effect on the model as a random vector. If your concept doesn’t pass this test, then the TCAV will say, “I don’t know. This concept doesn’t look like something that was important to the model.”


Is TCAV essentially about creating trust in AI, rather than a genuine understanding of it?

It is not — and I’ll explain why, because it’s a fine distinction to make.

We know from repeated studies in cognitive science and psychology that humans are very gullible. What that means is that it’s actually pretty easy to fool a person into trusting something. The goal of interpretability for machine learning is the opposite of this. It is to tell you if a system is not safe to use. It’s about revealing the truth. So “trust” isn’t the right word.

So the point of interpretability is to reveal potential flaws in an AI’s reasoning?

Yes, exactly.

How can it expose flaws?

You can use TCAV to ask a trained model about irrelevant concepts. To return to the example of doctors using AI to make cancer predictions, the doctors might suddenly think, “It looks like the machine is giving positive predictions of cancer for a lot of images that have a kind of bluish color artifact. We don’t think that factor should be taken into account.” So if they get a high TCAV score for “blue,” they’ve just identified a problem in their machine-learning model.

TCAV is designed to bolt on to existing AI systems that aren’t interpretable. Why not make the systems interpretable from the beginning, rather than black boxes?

There is a branch of interpretability research that focuses on building inherently interpretable models that reflect how humans reason. But my take is this: Right now you have AI models everywhere that are already built, and are already being used for important purposes, without having considered interpretability from the beginning. It’s just the truth. We have a lot of them at Google! You could say, “Interpretability is so useful, let me build you another model to replace the one you already have.” Well, good luck with that.

So then what do you do? We still need to get through this critical moment of deciding whether this technology is good for us or not. That’s why I work “post-training” interpretability methods. If you have a model that someone gave to you and that you can’t change, how do you go about generating explanations for its behavior so that you can use it safely? That’s what the TCAV work is about.



TCAV lets humans ask an AI if certain concepts matter to it. But what if we don’t know what to ask — what if we want the AI system to explain itself?

We have work that we’re writing up right now that can automatically discover concepts for you. We call it DTCAV — discovery TCAV. But I actually think that having humans in the loop, and enabling the conversation between machines and humans, is the crux of interpretability.

A lot of times in high-stakes applications, domain experts already have a list of concepts that they care about. We see this repeat over and over again in our medical applications at Google Brain. They don’t want to be given a set of concepts — they want to tell the model the concepts that they are interested in. We worked with a doctor who treats diabetic retinopathy, which is an eye disease, and when we told her about TCAV, she was so excited because she already had many, many hypotheses about what this model might be doing, and now she can test those exact questions. It’s actually a huge plus, and a very user-centric way of doing collaborative machine learning.

You believe that without interpretability, humankind might just give up on AI technology. Given how powerful it is, do you really think that’s a realistic possibility?

Yes, I do. That’s what happened with expert systems. [In the 1980s] we established that they were cheaper than human operators to conduct certain tasks. But who is using expert systems now? Nobody. And after that we entered an AI winter.

Right now it doesn’t seem likely, because of all the hype and money in AI. But in the long run, I think that humankind might decide — perhaps out of fear, perhaps out of lack of evidence — that this technology is not for us. It’s possible.

Read Source Article: Quanta Magazine


Despite acknowledging the value of AI and machine learning technologies, most business organizations are lagging behind in using them, reports a survey from RELX Group.

More than 91 percent of senior executives polled understood that AI and machine learning were important, but only 56 percent were using them, and 18 percent could not say how the technologies were being used in their business.

RELX says the gap in adoption is due to senior leadership not being able to communicate the benefits of the technologies to employees and how or why they are being used.

"While awareness of these technologies and their benefits is higher than ever before, endorsement from key decision makers has not been enough to spark matching levels of adoption," said Kumsal Bayazit, Chair of RELX Group's Technology Forum.

"Acknowledging that the world is changing needs to be coupled with significant investment and focus on these emerging technologies to stay competitive in today's business landscape."

The top uses for these technologies are improving worker productivity (51 percent), making better business decisions (41 percent), and streamlining business processes (39 percent).

Additional findings: companies are saving money by automating decisions (40 percent), retaining customers longer (36 percent), and detecting fraud and waste more easily (33 percent).

RELX recently opened an AI Lab in Amsterdam so that its scientists can work with Dutch universities to combine developments from academia and industry. It recently published open access research on AI and machine learning as part of its free Artificial Intelligence Resource Center.

RELX Group has a valuation of $41 billion. It used to be a major media company with hundreds of magazines and newspapers and was better known as Reed-Elsevier. It has successfully transitioned into an information and data company selling a variety of services to businesses and at a far higher valuation than if it had remained a media publisher.


© copyright 2017 All Rights Reserved.

A Product of HunterTech Ventures