GST in India
Machine learning is the subfield of computer science that gives computers the
ability to learn without being explicitly programmed. It explores the study and
construction of algorithms that can learn from and make predictions on data.
Such algorithms overcome the limits of strictly static program instructions by
making data-driven predictions or decisions, building a model from sample
inputs. Machine learning is employed in a range of computing tasks where
designing and programming explicit algorithms is infeasible; example
applications include spam filtering, detection of network intruders or
malicious insiders working towards a data breach, optical character
recognition, search engines and computer vision.
Suppose you go shopping for mangoes one day. The vendor has laid out a cart full of
mangoes. You can handpick the mangoes, the vendor will weigh them, and you pay
according to a fixed rate in rupees per kg (a typical story in India).
Obviously, you want to pick the sweetest, ripest mangoes for yourself (since you are
paying by weight and not by quality). How do you choose the mangoes?
You remember your grandmother saying that bright yellow mangoes are sweeter than
pale yellow ones. So you make a simple rule: pick only from the bright yellow mangoes.
You check the color of the mangoes, pick the bright yellow ones, pay up, and return
home. Happy ending?
Not quite.
Life is complicated
Suppose you go home and taste the mangoes. Some of them are not as sweet as you'd
like. You are worried. Apparently, your grandmother's wisdom is insufficient. There is
more to mangoes than just color.
After a lot of pondering (and tasting different types of mangoes), you conclude that the
bigger, bright yellow mangoes are guaranteed to be sweet, while the smaller, bright
yellow mangoes are sweet only half the time (i.e. if you buy 100 bright yellow mangoes,
out of which 50 are big in size and 50 are small, then the 50 big mangoes will all be
sweet, while out of the 50 small ones, on average only 25 mangoes will turn out to be
sweet).
You are happy with your findings, and you keep them in mind the next time you go
mango shopping. But next time at the market, you see that your favorite vendor has gone
out of town. You decide to buy from a different vendor, who supplies mangoes grown
in a different part of the country. Now, you realize that the rule which you had learnt
(that big, bright yellow mangoes are the sweetest) is no longer applicable. You have to
learn from scratch. You taste a mango of each kind from this vendor, and realize that the
small, pale yellow ones are in fact the sweetest of all.
Now, a distant cousin visits you from another city. You decide to treat her to mangoes.
But she mentions that she doesn't care about the sweetness of a mango; she only wants
the juiciest ones. Once again, you run your experiments, tasting all kinds of mangoes,
and realize that the softer ones are juicier.
Now, you move to a different part of the world. Here, mangoes taste surprisingly different
from those in your home country. You realize that the green mangoes are in fact tastier than the
yellow ones.
You marry someone who hates mangoes. She loves apples instead. You go apple
shopping. Now, all your accumulated knowledge about mangoes is worthless. You have
to learn everything about the correlation between the physical characteristics and the
taste of apples, by the same method of experimentation. You do it, because you love her.
Now, imagine that all this while, you were writing a computer program to help you choose
your mangoes (or apples). You would write rules of the following kind:
if (color is bright yellow and size is big and sold by favorite vendor): mango is sweet.
if (soft): mango is juicy.
etc.
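As a rough illustration, the two rules above could be written as ordinary Python functions. This is only a sketch: the dictionary keys ("color", "size", "vendor", "firmness") and their values are hypothetical names chosen for this example.

```python
def is_sweet(mango):
    """Hand-written rule: big, bright yellow mangoes from the favorite vendor are sweet."""
    return (
        mango["color"] == "bright yellow"
        and mango["size"] == "big"
        and mango["vendor"] == "favorite"
    )


def is_juicy(mango):
    """Hand-written rule: soft mangoes are juicy."""
    return mango["firmness"] == "soft"


# A hypothetical mango described by the assumed feature names.
mango = {
    "color": "bright yellow",
    "size": "big",
    "vendor": "favorite",
    "firmness": "soft",
}
print(is_sweet(mango), is_juicy(mango))
```

Every new observation (a new vendor, a new country, a switch to apples) means editing these functions by hand.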
You would use these rules to choose the mangoes. You could even send your younger
brother with this list of rules to buy the mangoes, and you would be assured that he will
pick only the mangoes of your choice.
But every time you make a new observation from your experiments, you have to
manually modify the list of rules. You have to understand the intricate details of all the
factors affecting the quality of mangoes. If the problem gets complicated enough, it can
get really difficult to make accurate rules by hand that cover all possible types of
mangoes. Your research could earn you a PhD in Mango Science (if there is one).
ML algorithms are an evolution of ordinary algorithms. They make your programs
"smarter" by allowing them to learn automatically from the data you provide.
You take a randomly selected sample of mangoes from the market (training data) and
make a table of the physical characteristics of each mango, such as color, size, shape,
the part of the country where it was grown, the vendor who sold it, etc. (features), along
with the sweetness, juiciness and ripeness of that mango (output variables). You feed this
data to the machine learning algorithm (classification/regression), and it learns a model of
the correlation between an average mango's physical characteristics and its quality.
Next time you go to the market, you measure the characteristics of the mangoes on sale
(test data) and feed them to the ML algorithm. It will use the model computed earlier to
predict which mangoes are sweet, ripe and/or juicy. The algorithm may internally use
rules similar to the ones you wrote by hand earlier (e.g. a decision tree), or it may
use something more involved, but to a large extent you don't need to worry about that.
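To make the idea concrete, here is a minimal sketch of a "learner" that builds its model from data instead of hand-written rules. It is not a decision tree or any particular library's algorithm; it simply records, for each combination of features, the fraction of training mangoes that turned out sweet. The function names, the 0.75 threshold and the choice to treat unseen combinations as not sweet are all assumptions made for illustration. The training data mirrors the 100 bright yellow mangoes described earlier.

```python
from collections import defaultdict


def train(samples):
    """Learn, for each feature combination, the fraction of sweet mangoes."""
    counts = defaultdict(lambda: [0, 0])  # features -> [sweet count, total count]
    for features, sweet in samples:
        counts[features][0] += int(sweet)
        counts[features][1] += 1
    return {features: s / n for features, (s, n) in counts.items()}


def predict(model, features, threshold=0.75):
    """Predict 'sweet' only when the learned fraction reaches the threshold.

    Feature combinations never seen in training are treated as not sweet
    (an assumption of this sketch).
    """
    return model.get(features, 0.0) >= threshold


# 50 big bright yellow mangoes, all sweet; 50 small ones, 25 of them sweet.
training_data = (
    [(("bright yellow", "big"), True)] * 50
    + [(("bright yellow", "small"), True)] * 25
    + [(("bright yellow", "small"), False)] * 25
)

model = train(training_data)
print(model[("bright yellow", "big")])    # 1.0
print(model[("bright yellow", "small")])  # 0.5
print(predict(model, ("bright yellow", "big")))    # True
print(predict(model, ("bright yellow", "small")))  # False
```

When the vendor or the fruit changes, only the training data changes; the code stays the same.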
Aindra Systems is one of the very few startups in India working in computer vision (CV)
and ML. Collaborating with some of the brilliant minds in the computer vision and ML
domain from IIT Mandi, IIT Chennai and UMass Boston, they are progressively developing
multiple products to address problems of huge social impact.
Aindra's unique and innovative products and initiatives have drawn attention from
well-known publications such as The Economic Times, YourStory, Business Insider,
VCCircle, TechCrunch and many more across the globe.
Our work in this area is twofold. We design, engineer and manufacture our own
machine vision cameras in our office in Bangalore. Camera designs range from
those with little on-board processing to smart cameras, which are user-configurable
(semi-programmable) and can work without a separate processor in a closed-loop
control system. The Computer Vision/Embedded System Groups develop
algorithms/libraries for our cameras.
Types of algorithms
The world has become incredibly data rich. Eric Schmidt once said "Every two days now
we create as much information as we did from the dawn of civilization up until 2003".
That is something like five exabytes of data. It has become extremely important to
process the data being generated and see what information can be extracted from it.
Machine learning can be used to make sense of the data by learning models/rules
from it. These models/rules can then be used to predict the output value when
presented with a new set of input features. The data set used for building
models is usually referred to as the training set.
In supervised learning, each entry in a training set consists of a vector of input features
(represented as X) and a target variable (represented as y). From this training set, we
learn a vector of parameters which can be used to predict the value of the target
variable for previously unseen or newly generated features. If the target variable to be
predicted is continuous-valued, we call it a regression problem; if it is discrete,
we call it a classification problem.
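The difference between the two problem types can be sketched in a few lines of plain Python. The example below is only illustrative: the data values and function names are made up, the regression part uses ordinary least squares for a single feature, and the classification part uses a nearest-class-mean rule (one simple choice among many). In both cases the "learned parameters" are a handful of numbers computed from the training set.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fit_class_means(xs, labels):
    """Learn one parameter per class: the mean feature value of that class."""
    grouped = {}
    for x, label in zip(xs, labels):
        grouped.setdefault(label, []).append(x)
    return {label: sum(vals) / len(vals) for label, vals in grouped.items()}


def classify(class_means, x):
    """Predict the class whose mean is closest to x."""
    return min(class_means, key=lambda label: abs(x - class_means[label]))


# Regression: the target is continuous.
slope, intercept = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
print(slope, intercept)        # 2.0 1.0
print(slope * 5.0 + intercept)  # prediction for an unseen x = 5.0

# Classification: the target is discrete.
means = fit_class_means([1.0, 1.5, 4.0, 4.5], ["small", "small", "big", "big"])
print(classify(means, 2.0), classify(means, 3.5))  # small big
```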
In unsupervised learning, we know only the set of input features in advance. In other
words, the input features have no associated target labels or values. The task
of unsupervised learning is to find structure among the input features and separate
them into different groups. Google News uses unsupervised learning to identify
coherent groups of news articles.
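One common way to find such groups is k-means clustering. The sketch below is a bare-bones version for one-dimensional data, written for illustration only: the points, the starting centers and the fixed iteration count are assumptions, and real implementations handle initialization and convergence more carefully.

```python
def kmeans_1d(points, centers, iterations=10):
    """Alternate between assigning points to the nearest center and
    moving each center to the mean of its assigned points."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
    return centers, clusters


# No labels are given; the algorithm only sees the feature values.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
print(centers)
print(clusters)
```

With this data the two centers settle near 2 and 11, separating the small values from the large ones without ever being told which is which.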
Reinforcement learning deals with how software agents interact with an environment
and take actions that maximize their reward or minimize the cost associated with each
action. For this, we need to know the set of environment states S, the set of actions A
that the agent can take, the rules for transitioning between states, the rules that determine
the reward/cost of a transition, and the rules that describe what the agent observes.
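These ingredients can be seen in a toy example. The sketch below uses tabular Q-learning, one well-known reinforcement learning method, on a hypothetical four-state corridor: the states, actions, transition rule and reward rule are all invented for illustration, and the agent observes its state directly. The hyperparameters (500 episodes, learning rate 0.5, discount 0.9) are arbitrary choices.

```python
import random

N_STATES = 4      # states S = {0, 1, 2, 3}; state 3 is the goal (terminal)
ACTIONS = [0, 1]  # actions A: 0 = move left, 1 = move right


def step(state, action):
    """Transition and reward rules of the toy corridor."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Estimate the value of each (state, action) pair from experience."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)  # explore uniformly at random
            next_state, reward = step(state, action)
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q = q_learning()
# Greedy policy: in each non-terminal state, pick the action with the highest value.
policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

The learned policy should prefer moving right in every state, since that is the shortest route to the reward.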
Broadly speaking, learning algorithms try to uncover the underlying probability
distribution of the data and use it to predict future values. Machine learning is one of the
fastest growing fields, and many leading firms use it for product recommendations, spam
classification, stock market analysis, etc.