<p>Amanullah Tariq, Software Engineer and Data Scientist</p>
<h1 id="neubeez-hackathon-freiburg-2016">Neubeez - Hackathon Freiburg 2016</h1>
<p><em>Published 2017-06-21</em></p>
<p>The second Freiburg hackathon took place in 2016; its theme was “Newcomers” to the city of Freiburg. SteepMinds took on this challenge and came up with an idea: encourage people already living in Freiburg to help newcomers, rewarding the helpers with discounts in Freiburg shops.</p>
<h2 id="neubeez">Neubeez</h2>
<p>Neubeez was a product focused on helping the newcomers of a society. The community built around Neubeez would be able to help people with the different legal and social questions they have about society or the city itself.</p>
<p>The platform gamifies the process of helping newcomers: experts are encouraged to answer newcomers’ questions and gain points in return. These points can then be used to buy vouchers and discounts from the different brands on our platform.</p>
<p>Our vision was to build a community that helps the newcomers of a society settle down and blend in more easily. For this idea we won 1st prize at the Freiburg Hackathon 2016.</p>
<p><img src="http://amanullahtariq.com//images/hackathon/neubeez-hackathon.jpg" alt="Hackathon Award" style="margin:auto; height:500px" /></p>
<h3 id="the-technologies">The Technologies</h3>
<p>Our application targets both newcomers and people already living in Freiburg, so we decided to create a multi-platform application: newcomers mostly use smartphones and tablets, while people already living here would mostly use it from their laptops. Our idea could be extended to cover many aspects, but for the sake of the hackathon we had to narrow down the project goals.</p>
<p>Since we were targeting a multi-platform application, we decided to work with Node.js and Cordova; this way we could develop one application that runs on every platform.</p>
<h3 id="technical-doing">Technical approach</h3>
<p>The project was then split into the subprojects UX-concept, Data Layer, Office and Queue Management, QRCode Scanning, and Document Storage. One developer was responsible for one subproject. Several times during the Hackathon, developers switched subprojects.</p>
<h3 id="ux-concept">UX-concept</h3>
<p>Working on the UX-concept and on the interface design started at the day of the Hackathon. The aim of the UX-concept was to make an interface that is easy to understand and user friendly. The user should always be guided to the best action, moving through the application in a few simple steps and reaching the desired goal quickly and easily.</p>
<p>To achieve this, it was important to start from the user’s point of view and to research similar applications online.</p>
<p>For the design we came up with the idea of asking the user for as little information as possible, since the application’s main purpose is to help the user. Viewing already posted answers requires no information at all, and only an email, Facebook, or Twitter account is needed to start posting new questions.</p>
<h3 id="data-layer">Data layer</h3>
<p>While the design was being finalized, the developers started on the data layer.</p>
<p><img src="http://amanullahtariq.com//images/hackathon/neulingoV1.png" alt="Hackathon class model" style="margin:auto" /></p>
<h3 id="features">Features</h3>
<ol>
<li>
<p>A multi-platform application where Neubeez (people coming from other countries) can post their day-to-day queries or problems regarding Freiburg. For this we decided to build a forum where Neubeez can post their questions and people can answer them.</p>
</li>
<li>
<p>To build the community we introduced a gamification process: people who help Neubeez get points and can see where they stand on the leaderboard.</p>
</li>
<li>
<p>People get points for posting a new question or a new answer, and other people can upvote already posted answers.</p>
</li>
<li>
<p>To motivate people we introduced a concept for utilizing the points: people can spend the points they have earned by answering questions in the local shops of Freiburg.</p>
</li>
</ol>
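<p>The reward loop described above can be sketched in a few lines. This is a hypothetical illustration: the point values and function names below are assumptions, not the actual Neubeez configuration.</p>

```python
# Hypothetical point values per action (assumed, not the real Neubeez rules)
POINTS = {'question': 5, 'answer': 10, 'upvote': 2}

def award(balance, action):
    """Return the new balance after an action earns points."""
    return balance + POINTS[action]

def redeem(balance, voucher_cost):
    """Spend points on a shop voucher; reject if the balance is too low."""
    if voucher_cost > balance:
        raise ValueError('not enough points')
    return balance - voucher_cost

balance = award(award(0, 'answer'), 'upvote')  # 12 points after an answer and an upvote
balance = redeem(balance, 10)                  # buy a 10-point voucher, 2 points left
```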
<p><img src="http://amanullahtariq.com//images/hackathon/view1.png" alt="Hackathon class model" style="margin:auto" /></p>
<h3 id="our-team">Our Team</h3>
<p>Our team (SteepMinds) was mainly composed of students from the University of Freiburg. Aman, Shayan, Zaid and Muazzam are currently pursuing a Master’s degree in Computer Science with a major in machine learning. We have all worked in industry for 5+ years, developing lots of software (web and mobile apps).</p>
<p>Let’s meet our team:</p>
<h4 id="aman-backend-nodejs-cordova-and-restful-api">Aman: Backend (Node.js, Cordova and RESTful API)</h4>
<p><img src="http://amanullahtariq.com//images/avatar.jpg" alt="Aman" style="margin:auto" /></p>
<h4 id="muazzam-backend-nodejs-cordova-and-restful-api">Muazzam: Backend (Node.js, Cordova and RESTful API)</h4>
<p><img src="http://amanullahtariq.com//images/muazzam.jpg" alt="Muazzam" style="margin:auto" /></p>
<h4 id="zaid-ux-concept-and-rd">Zaid: UX concept and R&amp;D</h4>
<p><img src="http://amanullahtariq.com//images/zaid.jpg" alt="Zaid" style="margin:auto" /></p>
<h4 id="shayan-ux-concept-and-rd">Shayan: UX concept and R&amp;D</h4>
<p><img src="http://amanullahtariq.com//images/shayan.jpg" alt="Shayan" style="margin:auto" /></p>
<h3 id="conclusion">Conclusion</h3>
<p>Getting a project done in two days is not an easy task. It requires great effort and teamwork. Building a team whose members bring different expertise is not easy either.</p>
<p>In the end, we completed the project and produced a prototype within 48 hours. For this kind of project, our takeaway is that you have to concentrate on the core functions and use as much external code as possible (NuGet, open-source libraries, etc.). Another takeaway was that it is good to have a finished design concept at the beginning of the project; the completed UX-concept helped guide our coding.</p>
<p>The Freiburg Hackathon was a good challenge to learn new things and test our skills in a “new project environment”. We all liked the concept of the hackathon. Finally, we believe our idea is a good concept that could become a really nice, working product.</p>
<h1 id="gradient-descent-post">Gradient Descent</h1>
<p><em>Published 2017-02-01</em></p>
<p>Gradient descent is one of the basic algorithms in machine learning; to master machine learning and deep learning, implementing gradient descent is a must. In this post, we write simple code to implement <strong>Linear Regression</strong> using gradient descent.</p>
<h2 id="gradient-descent">Gradient Descent</h2>
<p>An intuitive way to explain gradient descent is to imagine a mountain and a snowboarding path down it. The snowboarder’s goal is to reach the bottom, which is exactly what gradient descent strives to achieve.</p>
<p><img src="../images/GradientDescent/minima.jpg" alt="Global Minima" /></p>
<p>But some terrains have more than one low point: they can have multiple local minima but only a single global minimum.</p>
<p><img src="../images/GradientDescent/graph-local-global.png" alt="Local Minima" /></p>
<p>Suppose you were told to get to the lowest point possible. Between every two hills is a low point, but not necessarily the lowest point. Once gradient descent has started, it has no power to take you back up over a hill; to explore the second valley, you have to start a completely new descent down the second hill.</p>
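<p>This local-minimum trap is easy to reproduce numerically. The sketch below uses a toy function chosen for illustration (not from this post), f(x) = x&#8308; &minus; 3x&sup2; + x, which has a local minimum near x = 1 and a global minimum near x = &minus;1.37; the starting point alone decides which one gradient descent ends in.</p>

```python
def gradient_descent(x, lr=0.01, steps=500):
    # f(x) = x**4 - 3*x**2 + x, so f'(x) = 4*x**3 - 6*x + 2
    for _ in range(steps):
        x -= lr * (4 * x**3 - 6 * x + 2)
    return x

print(gradient_descent(2.0))    # starts right of the hill -> local minimum near 1.0
print(gradient_descent(-2.0))   # starts left of the hill -> global minimum near -1.37
```

<p>Both runs only ever move downhill, so neither can cross the hump between the two valleys.</p>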
<p>In this post we will implement linear regression using gradient descent. The main idea is that in each iteration we calculate the error and, based on it, find new values of the <strong>y-intercept (b)</strong> and <strong>slope (m)</strong> that fit our data better than before.</p>
<p><img src="https://raw.githubusercontent.com/mattnedrich/GradientDescentExample/master/gradient_descent_example.gif" alt="Gradient Descent" /></p>
<h3 id="to-learn-more-about-gradient-descent">To learn more about gradient descent</h3>
<ul>
<li>Gradient Descent (<a href="https://github.com/amanullahtariq/MLAlgorithm/blob/master/Challenge/LinearRegression/GradientDescent/GradientDescent.ipynb">notebook</a>, <a href="https://github.com/amanullahtariq/MLAlgorithm/blob/master/Challenge/LinearRegression/GradientDescent/manual.py">code</a>)</li>
<li><a href="http://machinelearningmastery.com/gradient-descent-for-machine-learning/">Gradient Descent For Machine Learning</a></li>
<li><a href="https://www.quora.com/What-is-an-intuitive-explanation-of-gradient-descent">Quora: Gradient Descent</a></li>
<li><a href="https://spin.atomicobject.com/2014/06/24/gradient-descent-linear-regression/">An Introduction to Gradient Descent and Linear Regression</a></li>
</ul>
<h3 id="implementation-of-gradient-descent">Implementation of Gradient Descent</h3>
<p>Gradient descent is an optimization algorithm used to find the values of the parameters (coefficients) of a function (f) that minimize a cost function.
So let’s implement the idea discussed above. We will write gradient descent manually, without any external machine learning libraries, to understand the intuition behind it, since gradient descent is considered a base algorithm for machine learning.</p>
<h4 id="import-libraries">Import Libraries</h4>
<p>We only use <strong>numpy</strong> for some math functions and <strong>matplotlib</strong> for plotting the data.</p>
<div class="highlighter-rouge"><pre class="highlight"><code>from numpy import *
import matplotlib.pyplot as plt
</code></pre>
</div>
<h4 id="load-the-dataset">Load the dataset</h4>
<p>The data is in the csv file <strong>data.csv</strong>. After reading the data from the file, we save it in a variable called <strong>points</strong> so we can use it afterwards.</p>
<div class="highlighter-rouge"><pre class="highlight"><code>points = genfromtxt("input/data.csv", delimiter=",")
print (points[0:10,:])
print('Shape: {0} '.format(points.shape))
</code></pre>
</div>
<p><img src="../images/GradientDescent/loaddata.png" alt="Data plot" /></p>
<h4 id="initialize-hyper-parameters">Initialize hyper-parameters</h4>
<p>Next, we initialize all the hyper-parameters with starting values so we can run the algorithm.</p>
<div class="highlighter-rouge"><pre class="highlight"><code>b = 0
m = 0
learning_rate = 0.0001 #alpha
num_iterations = 1000 #number of iteration
</code></pre>
</div>
<h4 id="error-function">Error function</h4>
<p>We compute the mean squared error from the difference between the original output <strong>y</strong> and the predicted output <strong>ŷ</strong>.</p>
<p><img src="../images/GradientDescent/linear_regression_error1.png" alt="derivative" /></p>
<div class="highlighter-rouge"><pre class="highlight"><code>def ComputerError(b, m, points):
    totalError = 0
    for i in range(len(points)):
        x = points[i, 0]
        y = points[i, 1]
        totalError += (y - (m * x + b)) ** 2
    return totalError / float(len(points))
</code></pre>
</div>
<h4 id="gradient-descent-1">Gradient Descent</h4>
<p>This method updates the <strong>y-intercept</strong> and <strong>slope</strong> by computing the partial derivatives with respect to the <strong>y-intercept</strong> and the <strong>slope</strong>. For this we use the equations below.</p>
<p><img src="../images/GradientDescent/linear_regression_gradient1.png" alt="derivative" /></p>
<div class="highlighter-rouge"><pre class="highlight"><code>def GetGradientDescent(b_current, m_current, points, learning_rate):
    b_gradient = 0
    m_gradient = 0
    N = float(len(points))
    for i in range(0, len(points)):
        x = points[i, 0]
        y = points[i, 1]
        b_gradient += -(2/N) * (y - ((m_current * x) + b_current))
        m_gradient += -(2/N) * x * (y - ((m_current * x) + b_current))
    new_b = b_current - (learning_rate * b_gradient)
    new_m = m_current - (learning_rate * m_gradient)
    return [new_b, new_m]
</code></pre>
</div>
<p>After writing a method to compute the new <strong>slope</strong> and <strong>y-intercept</strong>, we run the gradient descent algorithm to fit the line.</p>
<div class="highlighter-rouge"><pre class="highlight"><code>def RunAlgorithm(starting_b, starting_m, learning_rate, num_iterations, points):
    b = starting_b
    m = starting_m
    for i in range(num_iterations):
        b, m = GetGradientDescent(b, m, array(points), learning_rate)
        error = ComputerError(b, m, points)
        if error &lt; float64(2.0):
            print(error)
            return [b, m]
        if i % 100 == 0:
            print("After {0} iterations b = {1}, m = {2}, error = {3}".format(i, b, m, error))
    return [b, m]
</code></pre>
</div>
<h4 id="results">Results</h4>
<p>Below is a snapshot of gradient descent after running 1000 iterations on our example problem. We start at m = 0 and b = 0. In each iteration, m and b are updated to values that yield a slightly lower error than the previous iteration. The plot shows the current location of the gradient descent search (red dot) and the path taken to get there (blue line). Eventually we end up with a pretty accurate fit.</p>
<p><img src="../images/GradientDescent/final.png" alt="Data plot" /></p>
<h2 id="tips-for-gradient-descent">Tips for Gradient Descent</h2>
<p>This section lists some tips and tricks for getting the most out of the gradient descent algorithm for machine learning.</p>
<ul>
<li>Plot Cost versus Time: Collect and plot the cost values calculated by the algorithm each iteration. The expectation for a well performing gradient descent run is a decrease in cost each iteration. If it does not decrease, try reducing your learning rate.</li>
<li>Learning Rate: The learning rate is a small real value such as 0.1, 0.001 or 0.0001. Try different values for your problem and see which works best.</li>
<li>Rescale Inputs: The algorithm will reach the minimum cost faster if the shape of the cost function is not skewed and distorted. You can achieve this by rescaling all of the input variables (X) to the same range, such as [0, 1] or [-1, 1].</li>
<li>Few Passes: Stochastic gradient descent often does not need more than 1-to-10 passes through the training dataset to converge on good or good enough coefficients.</li>
<li>Plot Mean Cost: The updates for each training dataset instance can result in a noisy plot of cost over time when using stochastic gradient descent. Taking the average over 10, 100, or 1000 updates can give you a better idea of the learning trend for the algorithm.</li>
</ul>
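<p>The cost-tracking tips above can be checked in code: collect the cost each iteration and make sure it goes down. The sketch below reuses this post's update rule on synthetic data standing in for <strong>data.csv</strong> (the data and learning rate here are assumptions for illustration).</p>

```python
import numpy as np

# synthetic points along y = 3x + 5, standing in for data.csv
x = np.arange(20, dtype=float)
points = np.column_stack([x, 3.0 * x + 5.0])

def mse(b, m, pts):
    return np.mean((pts[:, 1] - (m * pts[:, 0] + b)) ** 2)

b = m = 0.0
learning_rate = 0.001
costs = []
for _ in range(200):
    err = points[:, 1] - (m * points[:, 0] + b)
    b += learning_rate * 2 * np.mean(err)                 # step opposite the b gradient
    m += learning_rate * 2 * np.mean(err * points[:, 0])  # step opposite the m gradient
    costs.append(mse(b, m, points))

# the cost should decrease every iteration; if it rises, reduce the learning rate
```

<p>Plotting <code>costs</code> with matplotlib then gives exactly the cost-versus-time curve the first tip recommends.</p>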
<p>Code for this can be found <a href="https://github.com/amanullahtariq/MLAlgorithm/blob/master/Challenge/LinearRegression/GradientDescent/GradientDescent.ipynb">here</a></p>
<h2 id="reference">Reference</h2>
<ul>
<li><a href="http://www.dummies.com/education/math/calculus/how-to-use-a-partial-derivative-to-measure-a-slope-in-three-dimensions/">HOW TO USE A PARTIAL DERIVATIVE TO MEASURE A SLOPE IN THREE DIMENSIONS</a></li>
<li><a href="http://mathinsight.org/image/partial_derivative_as_slope">Partial derivative as slope</a></li>
<li><a href="http://machinelearningmastery.com/gradient-descent-for-machine-learning/">Gradient Descent For Machine Learning</a></li>
<li><a href="https://www.quora.com/What-is-an-intuitive-explanation-of-gradient-descent">Quora: Gradient Descent</a></li>
<li><a href="https://spin.atomicobject.com/2014/06/24/gradient-descent-linear-regression/">An Introduction to Gradient Descent and Linear Regression</a></li>
<li><a href="http://www.theactuary.com/features/2013/06/the-art-of-var-optimisation/">The art of VaR optimisation</a></li>
<li><a href="https://github.com/llSourcell/linear_regression_live">linear_regression_live by Siraj Raval</a></li>
</ul>
<h1 id="gender-classification-post">Gender Classification</h1>
<p><em>Published 2017-01-22</em></p>
<p>In this post we will build a gender classifier: a short script that classifies anyone as male or female given just their body measurements, i.e. height, weight and shoe size. We will use sklearn, apply five different classifiers to our data, and then compare their scores and accuracy to select the best algorithm.</p>
<h2 id="classification">Classification</h2>
<p><img src="../images/classification/classfication.png" alt="Classification Approach" /></p>
<p>In machine learning, classification is a way of identifying which category a new observation belongs to. Take the example of classifying apples and oranges: for a new element, we have to identify whether it belongs to the orange category or the apple category. This process is called <strong>Classification</strong>.</p>
<p><img src="../images/classification/appleorange.jpg" alt="Orange And Apple Classification Approach" /></p>
<h4 id="examples-of-classification">Examples of Classification</h4>
<p>A few other examples of classification are:</p>
<ul>
<li>Text Categorization (e.g. Spam Filtering)</li>
<li>Classification of Apple and Oranges</li>
<li>Fraud Detection</li>
<li>Face Detection</li>
<li>Optical Character Recognition</li>
<li>Natural Language Processing</li>
</ul>
<p><img src="../images/classification/classfication1.png" alt="Classification Approach" /></p>
<p>The classifiers used for this problem are:</p>
<ul>
<li>Decision Tree Classifier</li>
<li>KNeighbors Classifier</li>
<li>Gaussian Process Classifier</li>
<li>Random Forest Classifier</li>
<li>Ada-Boost Classifier</li>
</ul>
<h2 id="so-lets-start">So Let’s Start</h2>
<p>The implementation is done in Python. The libraries used here are:</p>
<ul>
<li>numpy</li>
<li>sklearn</li>
<li>matplotlib</li>
</ul>
<p>For this problem we generated the data manually. The data has 4 variables: 3 input variables, <strong>Weight, Height and Shoe Size</strong>, and 1 output variable, <strong>Gender</strong>.</p>
<h4 id="1-first-load-all-the-usefull-libraries">1) First load all the useful libraries</h4>
<div class="highlighter-rouge"><pre class="highlight"><code># Import all the libraries here
import numpy as np
import matplotlib.pyplot as plt
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.metrics import accuracy_score
</code></pre>
</div>
<h4 id="2-initialize-all-the-classifers">2) Initialize all the Classifers</h4>
<div class="highlighter-rouge"><pre class="highlight"><code>decisionClf = DecisionTreeClassifier()
knnClf = KNeighborsClassifier()
gpcClf = GaussianProcessClassifier()
rpcClf = RandomForestClassifier(bootstrap=True)
adaBoostClf = AdaBoostClassifier()
</code></pre>
</div>
<h4 id="3-import-and-visualize-data">3) Import and visualize data</h4>
<p>We generate two types of data: one for training the classifier and one for prediction. X and Y are the training data, while test_X and test_Y will be used for prediction and for checking the accuracy score.</p>
<div class="highlighter-rouge"><pre class="highlight"><code># [height, weight, shoe_size]
X = [[181, 80, 44], [177, 70, 43], [160, 60, 38], [154, 54, 37], [166, 65, 40],
[190, 90, 47], [175, 64, 39],
[177, 70, 40], [159, 55, 37], [171, 75, 42], [181, 85, 43]]
Y = ['male', 'male', 'female', 'female', 'male', 'male', 'female', 'female',
'female', 'male', 'male']
#TEST DATA[height, weight, shoe_size]
test_X = [[179, 90, 44], [190, 88, 44], [165, 55, 37], [160, 60, 39], [156, 56, 36], [181, 85, 43], [174, 66, 40],
[177, 70, 43], [159, 66, 47], [188, 100, 44], [179, 84, 47]]
test_Y = ['male', 'male', 'female', 'female', 'male', 'male', 'female', 'female', 'female', 'male', 'male']
</code></pre>
</div>
<h4 id="4-classification">4) Classification</h4>
<p>We used 5 classifiers for this example, but feel free to try others; there are lots of classifiers you can use.</p>
<h4 id="a-decision-tree-classifier">a) Decision Tree Classifier</h4>
<p>Decision tree learning uses a decision tree as a predictive model that maps observations about an item (represented in the branches) to conclusions about the item’s target value (represented in the leaves). It is one of the simplest classifiers; since we are all a little familiar with coding, think of <strong>Decision Trees</strong> as a set of if-else conditions.</p>
<p><img src="../images/classification/decisiontree.png" alt="Classification Approach" /></p>
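<p>Before handing the data to sklearn, the “set of if-else conditions” view can be sketched by hand. The thresholds below are illustrative guesses, not learned from the data:</p>

```python
def tiny_tree(height, weight, shoe_size):
    # each branch is one "question" a decision tree might ask (thresholds assumed)
    if shoe_size > 40:
        return 'male'
    if height > 165:
        return 'male' if weight > 70 else 'female'
    return 'female'

print(tiny_tree(181, 80, 44))  # 'male'
print(tiny_tree(160, 60, 38))  # 'female'
```

<p>A real decision tree learner does exactly this, except it chooses the questions and thresholds automatically to best split the training data.</p>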
<div class="highlighter-rouge"><pre class="highlight"><code>decisionClf = decisionClf.fit(X, Y)
prediction = decisionClf.predict(test_X)
# Explained variance score: 1 is perfect prediction
print('Decision Tree Classifier')
print('Score: %.2f ' % accuracy_score(test_Y, prediction))
print('Variance score: %.2f' % decisionClf.score(test_X, test_Y))
</code></pre>
</div>
<p>Output</p>
<div class="highlighter-rouge"><pre class="highlight"><code>Decision Tree Classifier
Score: 0.64
Variance score: 0.64
</code></pre>
</div>
<h4 id="b-kneighbors-classifier">b) KNeighbors Classifier</h4>
<p>The <strong>KNeighbors Classifier</strong> is also an example of supervised learning, like <a href="http://amanullahtariq.com/applying_linear_regression/">Linear Regression</a>, which we discussed last week.
To learn more about the KNeighbors Classifier, <a href="https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm">visit here</a>.</p>
<p><img src="../images/classification/knnClassification.png" alt="Classification Approach" />.</p>
<div class="highlighter-rouge"><pre class="highlight"><code>knnClf = knnClf.fit(X, Y)
prediction = knnClf.predict(test_X)
print('KNeighbors Classifier')
print('Score: %.2f ' % accuracy_score(test_Y, prediction))
print('Variance score: %.2f' % knnClf.score(test_X, test_Y))
</code></pre>
</div>
<p>Output</p>
<div class="highlighter-rouge"><pre class="highlight"><code>KNeighbors Classifier
Score: 0.73
Variance score: 0.73
</code></pre>
</div>
<h4 id="c-guassian-process-classifier">c) Gaussian Process Classifier</h4>
<div class="highlighter-rouge"><pre class="highlight"><code>gpcClf = gpcClf.fit(X, Y)
prediction = gpcClf.predict(test_X)
print('Guassian Process Classifier')
print( 'Score: %.2f ' % accuracy_score(test_Y, prediction))
print('Variance score: %.2f' % gpcClf.score(test_X, test_Y))
</code></pre>
</div>
<p>Output</p>
<div class="highlighter-rouge"><pre class="highlight"><code>Guassian Process Classifier
Score: 0.73
Variance score: 0.73
</code></pre>
</div>
<h4 id="d-random-forest-classifier">d) Random Forest Classifier</h4>
<div class="highlighter-rouge"><pre class="highlight"><code>rpcClf = rpcClf.fit(X, Y)
prediction = rpcClf.predict(test_X)
# Explained variance score: 1 is perfect prediction
print('Random Forest Classifier')
print( 'Score: %.2f ' % accuracy_score(test_Y, prediction))
print('Variance score: %.2f' % rpcClf.score(test_X, test_Y))
</code></pre>
</div>
<p>Output</p>
<div class="highlighter-rouge"><pre class="highlight"><code>Random Forest Classifier
Score: 0.82
Variance score: 0.82
</code></pre>
</div>
<h4 id="e-ada-boost-classifier">e) Ada-Boost Classifier</h4>
<div class="highlighter-rouge"><pre class="highlight"><code>adaBoostClf = adaBoostClf.fit(X, Y)
prediction = adaBoostClf.predict(test_X)
# Explained variance score: 1 is perfect prediction
print('Ada-Boost Classifier')
print( 'Score: %.2f ' % accuracy_score(test_Y, prediction))
print('Variance score: %.2f' % adaBoostClf.score(test_X, test_Y))
</code></pre>
</div>
<p>Output</p>
<div class="highlighter-rouge"><pre class="highlight"><code>Ada-Boost Classifier
Score: 0.73
Variance score: 0.73
</code></pre>
</div>
<h4 id="5-result">5) Result</h4>
<div class="highlighter-rouge"><pre class="highlight"><code>Random Forest Classifier
Score: 0.82
Variance score: 0.82
Ada-Boost Classifier
Score: 0.73
Variance score: 0.73
Guassian Process Classifier
Score: 0.73
Variance score: 0.73
Decision Tree Classifier
Score: 0.64
Variance score: 0.64
KNeighbors Classifier
Score: 0.73
Variance score: 0.73
</code></pre>
</div>
<h2 id="summary">Summary</h2>
<p>The best score of the <strong>Random Forest Classifier</strong> is 0.82 and its worst score is 0.63. This is because a random forest, in order to improve predictive accuracy and control over-fitting, averages the predictions of multiple trees, so with this little data we get a different score on each run. All the other classifiers get the same score every time, so for the example above we could use any of them. If you want to improve the results, you must increase the amount of data: we generated the data manually here, and more data would allow the above algorithms to generalize their learned parameters better.</p>
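<p>The run-to-run variance of the random forest is easy to observe: refit it with different random seeds and the score moves. Below is a minimal sketch using the same training and test arrays as earlier (repeated so the snippet is self-contained); fixing <code>random_state</code> makes any single run reproducible.</p>

```python
from sklearn.ensemble import RandomForestClassifier

X = [[181, 80, 44], [177, 70, 43], [160, 60, 38], [154, 54, 37], [166, 65, 40],
     [190, 90, 47], [175, 64, 39], [177, 70, 40], [159, 55, 37], [171, 75, 42],
     [181, 85, 43]]
Y = ['male', 'male', 'female', 'female', 'male', 'male', 'female', 'female',
     'female', 'male', 'male']
test_X = [[179, 90, 44], [190, 88, 44], [165, 55, 37], [160, 60, 39], [156, 56, 36],
          [181, 85, 43], [174, 66, 40], [177, 70, 43], [159, 66, 47], [188, 100, 44],
          [179, 84, 47]]
test_Y = ['male', 'male', 'female', 'female', 'male', 'male', 'female', 'female',
          'female', 'male', 'male']

# one score per seed; on such a small dataset these typically differ
scores = [RandomForestClassifier(random_state=seed).fit(X, Y).score(test_X, test_Y)
          for seed in range(5)]
print(scores)
```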
<p>Source code can be <a href="https://github.com/amanullahtariq/MLAlgorithm/blob/master/Challenge/GenderClassification/Classification.ipynb">found here</a></p>
<h3 id="to-read-more-on-machine-learning-check-out">To read more on Machine Learning, check out</h3>
<ul>
<li>
<h4 id="linear-regression-challenge"><a href="http://amanullahtariq.com/applying_linear_regression/">Linear Regression Challenge</a></h4>
</li>
<li>
<h4 id="global-co2-emission"><a href="http://amanullahtariq.com/global-co2-emission">Global CO2 Emission</a></h4>
</li>
<li>
<h4 id="machine-learning-blog"><a href="http://amanullahtariq.com/">Machine Learning Blog</a></h4>
</li>
</ul>
<h2 id="reference">Reference</h2>
<ul>
<li><a href="http://blog.echen.me/2011/04/27/choosing-a-machine-learning-classifier/">Choosing a Machine Learning Classifier</a></li>
<li><a href="https://en.wikipedia.org/wiki/Statistical_classification">Wiki: Statistical Classification</a></li>
<li><a href="https://github.com/amanullahtariq/MLAlgorithm/tree/master/Challenge/GenderClassification">Github: Gender Classification </a></li>
<li><a href="https://en.wikipedia.org/wiki/Decision_tree_learning">Wiki: Decison Tree Classifer</a></li>
</ul>
<h1 id="linear-regression-challenge-post">Linear Regression Challenge</h1>
<p><em>Published 2017-01-20</em></p>
<p><strong>Linear regression</strong> is a supervised learning algorithm used to predict future data. In this post, we apply linear regression to the dataset provided as a challenge in the video <strong><a href="https://www.youtube.com/watch?v=vOppzHpvTiQ">How to Make a Prediction - Intro to Deep Learning #1</a></strong> created by Siraj Raval on YouTube.</p>
<h3 id="click-here-to-see-manual-implementation-of-linear-regression"><strong><a href="https://github.com/amanullahtariq/MLAlgorithm/tree/master/Challenge/LinearRegression/GradientDescent">Click here to see manual implementation of Linear Regression</a></strong></h3>
<p>In the traditional programming approach, we define every single step needed to complete our task.</p>
<p><img src="../images/programming.png" alt="Programming Approach" /></p>
<p><strong>Machine learning</strong> allows us to move past this old approach of coding. Instead of writing a sequence of steps to complete our task, we provide the outcome and the program learns an optimal way to reach that goal. For example, suppose we have a robot that wants to predict whether the thing in its hand is an apple or an orange. First we train our program by providing data samples; once the robot is trained, we can give it a random object and ask it to predict whether it is an apple or not.</p>
<p><img src="../images/ml-example.png" alt="Machine Learning Example" /></p>
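<p>The train-then-predict workflow can be shown with a toy version of the apple/orange example. The measurements below are made up, and a simple 1-nearest-neighbour rule stands in for the “learning”:</p>

```python
# training samples: (weight in grams, smoothness 0-10) -> label
train = {(140, 9): 'apple', (130, 8): 'apple',
         (150, 2): 'orange', (170, 3): 'orange'}

def predict(sample):
    # pick the label of the closest training example (squared euclidean distance)
    nearest = min(train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p, sample)))
    return train[nearest]

print(predict((145, 8)))  # close to the apples -> 'apple'
```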
<p><strong>Machine learning</strong> is generally divided into three categories. We will not go into detail about them in this post, but for general knowledge the three main categories of ML are:</p>
<ul>
<li><a href="https://www.youtube.com/watch?v=nPFnlua2Y5Q">Supervised Learning</a></li>
<li><a href="https://www.youtube.com/watch?v=nPFnlua2Y5Q">Unsupervised Learning</a></li>
<li><a href="https://www.youtube.com/watch?v=e3Jy2vShroE">Reinforcement Learning</a></li>
</ul>
<p><strong>Linear regression</strong> attempts to model the relationship between two variables by fitting a linear equation to observed data. One variable is considered an explanatory variable, and the other a dependent variable. For example, a modeler might want to relate the weights of individuals to their heights using a linear regression model.</p>
<p>To get more detail about linear regression check out this <a href="https://www.youtube.com/watch?v=vOppzHpvTiQ&t=4s">video</a></p>
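<p>The height/weight example can be made concrete in a couple of lines. This sketch uses made-up numbers and numpy's <code>polyfit</code> rather than the scikit-learn approach used below, just to show what “fitting a linear equation to observed data” means:</p>

```python
import numpy as np

heights = np.array([150.0, 160.0, 170.0, 180.0, 190.0])  # explanatory variable (cm)
weights = np.array([50.0, 56.0, 64.0, 72.0, 80.0])       # dependent variable (kg)

m, b = np.polyfit(heights, weights, 1)  # slope and intercept of the best-fit line
print(m, b)  # weight ≈ 0.76 * height - 64.8 for this toy data
```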
<h2 id="challenge">Challenge</h2>
<ul>
<li>
<p>The challenge for this post is to use scikit-learn to create a line of best fit for the included <strong>‘challenge_dataset’</strong>. Then, make a prediction for an existing data point and see how close it matches up to the actual value. Print out the error you get. You can use scikit-learn’s <a href="http://scikit-learn.org/stable/documentation.html">documentation</a> for more help.</p>
</li>
<li>
<p>Bonus points if you perform linear regression on a dataset with 3 different variables.</p>
</li>
</ul>
<h2 id="requirements">Requirements</h2>
<ul>
<li><a href="http://jupyter.org/install.html">Jupyter</a></li>
<li><a href="https://www.python.org/">Python</a></li>
</ul>
<h2 id="code">Code</h2>
<p>The implementation is done in Python. The libraries used here are:</p>
<ul>
<li>pandas</li>
<li>numpy</li>
<li>sklearn</li>
<li>matplotlib</li>
</ul>
<h4 id="challenge-1-using-linear-regression-on-challenge_dataset">Challenge 1: Using Linear Regression on <strong>‘challenge_dataset’</strong></h4>
<p>First load all the libraries</p>
<div class="highlighter-rouge"><pre class="highlight"><code>import pandas as pd
import numpy as np
from sklearn import linear_model as model
import matplotlib.pyplot as plt
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code>#read data from the challenge_dataset
dataframe = pd.read_csv('input/challenge_dataset.txt')
x_values = dataframe[[0]]
y_values = dataframe[[1]]
#train model on data
regr = model.LinearRegression()
regr.fit(x_values, y_values)
</code></pre>
</div>
<h4 id="mean-square-error">Mean Square Error</h4>
<div class="highlighter-rouge"><pre class="highlight"><code># The coefficients
print('Coefficients: ', regr.coef_)
# The mean squared error
print('Mean squared error: %.2f ' % np.mean((regr.predict(x_values) - y_values) ** 2))
# Explained variance score: 1 is perfect prediction
print('Variance score: %.2f' % regr.score(x_values, y_values))
</code></pre>
</div>
<p><img src="../images/results/challenge-result.png" alt="Result of Challenge Data-Set" /></p>
<h4 id="visualization">Visualization</h4>
<div class="highlighter-rouge"><pre class="highlight"><code>#Visualize Results
plt.scatter(x_values, y_values)
plt.plot(x_values, regr.predict(x_values))
plt.xlabel('X')
plt.ylabel('Y')
plt.title('Challenge Dataset')
plt.show()
</code></pre>
</div>
<p><img src="../images/results/challenge.png" alt="Result of Challenge Data-Set" /></p>
<h4 id="challenge-2-linear-regression-on-a-dataset-with-3-different-variables">Challenge 2: Linear regression on a dataset with 3 different variables.</h4>
<div class="highlighter-rouge"><pre class="highlight"><code>import matplotlib.pyplot as plt
from sklearn import linear_model
from sklearn import datasets
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.ticker import LinearLocator, FormatStrFormatter
import numpy as np
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code># Load the iris dataset
iris = datasets.load_iris()
# for the bonus above, consider only 3 different variables,
# i.e. two input variables and one output variable
x_values = iris.data[:,1:3]
y_values = iris.target
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code>#train model on data
regr = linear_model.LinearRegression()
regr.fit(x_values, y_values)
</code></pre>
</div>
<h4 id="mean-square-error-1">Mean Square Error</h4>
<div class="highlighter-rouge"><pre class="highlight"><code># The coefficients
print('Coefficients: ', regr.coef_)
# The mean squared error
print('Mean squared error: %.2f ' % np.mean((regr.predict(x_values) - y_values) ** 2))
# Explained variance score: 1 is perfect prediction
print('Variance score: %.2f' % regr.score(x_values, y_values))
</code></pre>
</div>
<p><img src="../images/results/bonus-result.png" alt="Result of Challenge Data-Set" /></p>
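<p>With two input features, <code class="highlighter-rouge">regr.coef_</code> holds one weight per feature. A quick way to sanity-check a multivariate fit (shown here on made-up, noise-free data rather than the iris set) is to compare it against the closed-form least-squares solution:</p>

```python
import numpy as np
from sklearn import linear_model

# made-up, noise-free data: two input features, one target
rng = np.random.RandomState(0)
X = rng.rand(50, 2)
y = 1.5 * X[:, 0] - 0.7 * X[:, 1] + 2.0

regr = linear_model.LinearRegression()
regr.fit(X, y)

# closed-form check: prepend a bias column and solve least squares
Xb = np.hstack([np.ones((50, 1)), X])
theta = np.linalg.lstsq(Xb, y, rcond=None)[0]
print(regr.intercept_, regr.coef_)  # matches theta[0] and theta[1:]
```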
<h4 id="visualization-1">Visualization</h4>
<div class="highlighter-rouge"><pre class="highlight"><code>#Visualize Results
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(x_values[:,0],x_values[:,1], y_values, c='g', marker= 'o')
#ax.scatter(x_values[:,0],x_values[:,1], regr.predict(x_values), c='r', marker= 'o')
ax.plot_trisurf(x_values[:,0], x_values[:,1], regr.predict(x_values), cmap=cm.hot, alpha=0.2)  # fitted plane
ax.set_xlabel('Sepal Length')
ax.set_ylabel('Sepal Width')
ax.set_zlabel('Species')
ax.set_title('Original Dataset')
plt.show()
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(x_values[:,0],x_values[:,1], regr.predict(x_values), c='r', marker= 'o')
ax.plot_trisurf(x_values[:,0], x_values[:,1], regr.predict(x_values), cmap=cm.hot, alpha=0.2)  # fitted plane
ax.set_xlabel('Sepal Length')
ax.set_ylabel('Sepal Width')
ax.set_zlabel('Species')
ax.set_title('Predicted Dataset')
plt.show()
</code></pre>
</div>
<p><img src="../images/results/bonus.png" alt="Result of Challenge Data-Set" />
<img src="../images/results/bonus1.png" alt="Result of Challenge Data-Set" /></p>
<p>Source code can be <a href="https://github.com/amanullahtariq/MLAlgorithm/tree/master/Challenge/LinearRegression/Challenge.ipynb">found here</a>.</p>
<h2 id="click-here-to-see-manual-implementation-of-linear-regression-1"><strong><a href="https://github.com/amanullahtariq/MLAlgorithm/tree/master/Challenge/LinearRegression/GradientDescent">Click here to see manual implementation of Linear Regression</a></strong></h2>
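<p>For context, the manual implementation linked above is based on gradient descent. The general idea can be sketched in a few lines; this is a minimal, illustrative version on made-up data, not the repository's exact code:</p>

```python
import numpy as np

def gradient_descent(x, y, lr=0.01, epochs=5000):
    # fit y = m * x + b by repeatedly stepping down the MSE gradient
    m, b = 0.0, 0.0
    n = len(x)
    for _ in range(epochs):
        error = (m * x + b) - y
        m -= lr * (2.0 / n) * np.dot(error, x)  # partial derivative w.r.t. m
        b -= lr * (2.0 / n) * error.sum()       # partial derivative w.r.t. b
    return m, b

# toy data generated from y = 2x + 1
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0
m, b = gradient_descent(x, y)
print(m, b)  # approaches m = 2, b = 1
```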
<h2 id="reference">Reference</h2>
<ul>
<li><a href="http://onlinestatbook.com/2/regression/intro.html">Intro to Linear Regression</a></li>
<li><a href="http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html">Scikit-learn LinearRegression</a></li>
<li><a href="http://www.stat.yale.edu/Courses/1997-98/101/linreg.htm">Linear regression</a></li>
<li><a href="https://onlinecourses.science.psu.edu/stat501/node/250">Simple Linear Regression</a></li>
</ul>
<h2 id="hello-world">Hello World</h2>
<p><em>2017-01-08</em></p>
<p><strong>Hello World!</strong> This is my first post, and I personally <strong>welcome</strong> you to my personal blog. The main purpose of this blog is to share knowledge of Machine Learning and Data Science with you.</p>
<p>In this post, I will just give you a short introduction about myself, what this blog is about, and why I started it. Every week, I will try to write one article about the recent work I have been doing in the field of Machine Learning, so you can take a look at it as well.</p>
<p>I have always been passionate about using data to tell a good story. Back in my software career, I loved coming up with software recommendations based on industry trends and company-specific data. I moved to Germany in 2014, where I got the opportunity to learn from some of the most brilliant data scientists and machine learning experts. At the same time, I had the chance to witness how the company applied machine learning to create tremendous value. As a result, I became very inspired to use data to build great products and solve problems that make a difference in people's lives!</p>
<h3 id="lets-start-my-first-post">Let’s Start My First Post</h3>
<p>I am Amanullah Tariq, and I am currently pursuing a Master's in Computer Science at the University of Freiburg, where my major is Machine Learning.</p>
<p>I completed my Bachelor's in Computer Science in 2011. In 2012, I started my career as a Software Engineer, and over 2.5 years of professional experience I worked with several technologies, such as PHP, C#, ASP.NET, and MVC, to develop desktop software as well as web-based applications. After saving some money, I decided to pursue a Master's degree in Germany in the field of Machine Learning and Artificial Intelligence.</p>
<p>So, in 2014 I started my Master's degree, and I became fascinated with the power and capabilities of Machine Learning, Computer Vision, and Artificial Intelligence. Over the last two years I have taken several courses related to Machine Learning and Artificial Intelligence to get a good grasp of them.</p>
<p>Additionally, I took lab courses where I used C++, the Point Cloud Library, OpenCV, and supervised learning to <a href="http://www.slideshare.net/AmanullahTariq/parking-space-detect-70787242">detect free parking spots</a> from 2D images provided by Google Maps and from 3D data. I also participated in the <a href="http://www.slideshare.net/AmanullahTariq/daedalus-aadc2016">Audi Autonomous Driving Cup 2016</a>, where I worked on lane detection and crossroad detection using C++, OpenCV, and Caffe, and used a Leap Motion to control the car with different hand gestures over the TCP protocol for the Open Challenge.</p>
<p>Although this video is not from the final days of the competition, it sums up quite well what we were doing.
<a href="https://www.youtube.com/watch?v=vpVAawVMVIE"><img src="https://img.youtube.com/vi/vpVAawVMVIE/0.jpg" alt="Watch Video" /></a></p>
<p>Although I am not an expert in this field and am still a student like you, I think the way you learn more is by doing experiments and by learning from other people who are going through the same phase as you. If anybody wants to contribute to this blog, you are always welcome.</p>
<p>I know you must be wondering why I started this blog only now, since I have been working in this field for quite a long time. Honestly, I always thought that writing posts was the most boring thing in the world; sorry if that hurts anybody, but that is what I thought.</p>
<p>Anyway, I am very motivated to share my work with you and eager to get this blog started. See you next week.
You can follow me on <a href="https://github.com/amanullahtariq">Github</a> or check out my work on <a href="https://de.linkedin.com/in/amanullah-tariq-60a0b822">Linkedin</a>.</p>