Project: Personal website

I went to a couple of events in November. There I got a chance to talk to amazing people with good industry experience. As I’m about to start my “official” professional career, I wanted some insight into how companies work and which technologies are being used. To my surprise, 75% of the people talked about Angular and what an amazing technology it is. So, naturally, I wanted to learn it as well!

Many people at the events talked about various online courses. I have learned a lot on the internet but never bought a course! I love free stuff! But people suggested that some paid courses are worth it. As Angular 2 is pretty new, there were no good free courses online. Around that time I saw an ad on Facebook for a Black Friday sale on Udemy. I wasted no time grabbing the offer and bought an Angular 2 course for just $14.

I must say the course is really good. All the topics are carefully chosen and well explained. So I’m learning the technology and, at the same time, building a personal website project. To actually understand Angular 2 fully, I will surely build at least a couple of projects.

The whole of Angular 2 revolves around components or, more precisely, directives. So initially I have planned components such as about, contact, projects, and skills.

Project as of now: I have completed 4 sections of the Udemy course. So far I have learnt data binding, property binding, event binding, and components.

So, clearly there is a long way to go! But I’m confident that learning and mastering this technology will be a great asset to my career. And by the way, I’m loving Angular 2.

See you in the next one!


Beginning ML: What next?

Machine learning is a branch of artificial intelligence that deals with lots and lots of data. In our neural network models we used the MNIST dataset, which was already pre-processed and could be used directly by TensorFlow. We also used a raw dataset that we had to pre-process before our model could use it. Clearly, one needs to understand and analyse data (in fact, large amounts of data).

Data analysis is the field that deals with data: it helps us understand data and use it to draw conclusions. Data is a critical part of machine learning, so we have to learn to analyse it.

In the next posts we will look at how we can use Python to perform data analysis. I’m looking up a few online courses to learn data analysis.

Let’s catch up in the next one!

Beginning ML: Movie Review Sentiment Analyser cont. :KnowDev

In the last post we used pandas to extract raw data from .csv files and used the bag-of-words model to pre-process our data into feature sets.

In this post we will train the model. It’s the simplest part. We will use a random forest to predict. A random forest is a collection of decision trees.
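To make "a collection of decision trees" a bit more concrete, here is a minimal sketch of how a forest could combine its trees by majority vote. It is only an illustration: the three hand-written "trees" and the name forest_predict are hypothetical, and nothing here is trained the way scikit-learn trains a real forest.

```python
from collections import Counter


def forest_predict(trees, sample):
    """Return the label that the largest number of trees vote for."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]


# Three hypothetical hand-written "trees". Each one looks at a single
# word count and votes 1 (positive) or 0 (negative).
toy_trees = [
    lambda s: 1 if s.get("great", 0) > 0 else 0,
    lambda s: 0 if s.get("boring", 0) > 0 else 1,
    lambda s: 1 if s.get("love", 0) > 0 else 0,
]

# A review represented as word counts (bag of words).
review = {"great": 1}
prediction = forest_predict(toy_trees, review)
print(prediction)
```

Two of the three trees vote positive for this review, so the forest as a whole predicts positive even though one tree disagrees.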

First we initialize the forest with 100 decision trees.

from sklearn.ensemble import RandomForestClassifier
forest = RandomForestClassifier(n_estimators=100)

We will use the fit function of the forest variable to build a forest of trees from the training set.

forest = forest.fit(trained_cleaned_data, train['sentiment'])

trained_cleaned_data is the pre-processed data from our last post, and train['sentiment'] holds the label for each row of that data. And we are done with training our model.

Now, we can test and predict using our model.

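To test, we first have to transform the raw test data into the required format. We use transform rather than fit_transform on the test data so that the vocabulary learned from the training set is reused and the model is not fitted to the test data.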

    test_data_features = vectorizer.transform(clean_test_reviews)
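The difference between fitting and transforming can be sketched with a tiny hand-rolled vectorizer. This is a simplified stand-in, not scikit-learn’s CountVectorizer; the class TinyVectorizer and the sample sentences are made up for illustration.

```python
class TinyVectorizer:
    """A toy bag-of-words vectorizer with separate fit and transform steps."""

    def fit(self, documents):
        # Learn the vocabulary from the training documents only.
        words = sorted({w for doc in documents for w in doc.split()})
        self.vocabulary_ = {w: i for i, w in enumerate(words)}
        return self

    def transform(self, documents):
        # Count words using the vocabulary learned in fit.
        # Words never seen during fit are simply ignored.
        rows = []
        for doc in documents:
            row = [0] * len(self.vocabulary_)
            for w in doc.split():
                if w in self.vocabulary_:
                    row[self.vocabulary_[w]] += 1
            rows.append(row)
        return rows


train_docs = ["good movie", "bad movie"]
test_docs = ["good good film"]

vec = TinyVectorizer().fit(train_docs)
test_rows = vec.transform(test_docs)
print(vec.vocabulary_)
print(test_rows)
```

Because the test documents only pass through transform, the word "film" (which never appeared in training) does not add a new column, and the test features line up with the columns the model was trained on.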

Then we will simply predict using the predict function of the forest variable.

    result = forest.predict(test_data_features)

We will finish off our testing by simply writing all the predictions to a file for permanent storage. And that’s it: we have used a new model and a new technique to build a sentiment analyzer. This model is not good enough for commercial use because, one, we did not use a large dataset and, two, we did not use a more sophisticated model. In upcoming posts we will see what those “sophisticated” techniques and models are. I’m sure those concepts will be much more interesting. With that, I’ll see you soon!

Complete source code here

Beginning ML – Movie Review Analysis: KnowDev

Up to the last post we have seen ways of building a sentiment analyzer using a multi-layer feedforward neural network. In this post we will also build a sentiment analyzer, one which can predict whether a movie review is positive or negative. We can consider this one of the use cases of what we have learned so far.

This topic is divided into 2 parts. One, pre-processing our data. Two, using the random forest technique to predict.

Pre-Processing :

We will use the pandas module to extract data from a csv file. As we did before, we will use the bag-of-words model to create feature sets. But first we have to clean up a little dirt: stripping HTML tags (using the beautifulsoup module), removing punctuation, and removing stop words. Stop words are words like the, and, an, is, etc., which do not add any specific emotion to the sentence. We remove punctuation as well just to reduce complexity; once we get familiar with what we are doing, we can add more complexity to our model. We will implement all this functionality in the function clean_text.
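As a rough idea of what clean_text could look like, here is a minimal sketch using only the standard library. It is not the code from this project: it swaps beautifulsoup for a simple regular expression, and the stop word list is a tiny made-up sample rather than a full one.

```python
import re

# A tiny illustrative stop word list (an assumption, not the real list).
STOPWORDS = {"the", "and", "an", "a", "is"}


def clean_text(raw_review):
    """Strip HTML tags, punctuation and stop words from one review."""
    # 1. Replace anything that looks like an HTML tag with a space.
    no_tags = re.sub(r"<[^>]+>", " ", raw_review)
    # 2. Keep letters only; punctuation and digits become spaces.
    letters_only = re.sub(r"[^a-zA-Z]", " ", no_tags)
    # 3. Lowercase and split into words.
    words = letters_only.lower().split()
    # 4. Drop stop words and join back into a single string.
    return " ".join(w for w in words if w not in STOPWORDS)


cleaned = clean_text("<b>The movie</b> is great, and fun!")
print(cleaned)
```

A regular expression is a cruder way to strip tags than a real HTML parser, so for messy real-world reviews beautifulsoup is the safer choice.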

Now we have to apply these modifications to all the reviews in our file. We do that in a function called create_clean_train. This function might take a couple of minutes because there are almost 25000 reviews altogether.

We will create feature sets using CountVectorizer from scikit-learn.

In the next post, we will complete building our movie review sentiment analyser. See you next time!

Complete source code: here

Beginning ML: Sentiment Analysis Using Textblob : KnowDev

In the last post we built a neural network for sentiment analysis. We used our own dataset, which was not big enough; indeed, we were only able to achieve an accuracy of 54%. Today we shall be using a Python module for sentiment analysis. We shall be building a Twitter sentiment analyzer! Believe me, you’ll be amazed by how easily we can achieve it!

First we need to install 2 modules. The first is tweepy, which allows us to make API calls to Twitter. We have to create an app on the Twitter developer site to authenticate ourselves. Next, we need textblob, which can perform sentiment analysis. TextBlob can actually perform many more operations apart from sentiment analysis. If you are interested, you can check it out here.

Let’s import our dependencies

import tweepy
from textblob import TextBlob

We have to declare 4 variables: consumer_key, consumer_secret, access_token, and access_token_secret. All of these can be found after we create an app on the Twitter developer site.

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

The 2 lines above authenticate us. We are almost done with the authentication.

api = tweepy.API(auth)

Through the api variable we can use the search operation to find public tweets.

public_tweet = api.search('search')

'search' is the keyword we are searching for. Now we can iterate through public_tweet and use TextBlob to perform sentiment analysis on each tweet.

for tweet in public_tweet:
    T = tweet.text
    analysis = TextBlob(T)
    # polarity ranges from -1.0 (negative) to 1.0 (positive)
    sentiment = analysis.sentiment.polarity
    print(T, sentiment)
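The polarity that TextBlob returns is a number between -1.0 and 1.0, so a small helper can turn it into a readable label. This is just a sketch; the function name label_polarity and the default threshold are my own choices, not part of TextBlob.

```python
def label_polarity(polarity, threshold=0.0):
    """Map a polarity score in [-1.0, 1.0] to a readable label."""
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


# Example scores, made up for illustration.
for score in (0.5, -0.2, 0.0):
    print(score, label_polarity(score))
```

Raising the threshold (for example to 0.1) treats weakly scored tweets as neutral instead of forcing them into positive or negative.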

And that’s it! We have successfully used the tweepy and textblob modules to build a Twitter sentiment analyzer in less than 25 lines. In fact, there are many more sources whose APIs we can use.

This is a relatively small post, and you know why! Now you can use the sentiment analyzer for a wide range of use cases, and I’ll see you in the next one!

Complete source code

Beginning ML – Sentiment Analysis Using Neural Network cont. : KnowDev

This post is a continuation of this.

I hope you have got a good understanding of why we have to pre-process. In this post we shall train our model and also feed it our own sentences.

First of all we shall get the feature sets that we created, either by loading them from pickle or by calling the function and storing the result in variables.

from create_sentiment_featuresets import create_feature_sets_and_labels
train_x, train_y, test_x, test_y = create_feature_sets_and_labels('pos.txt', 'neg.txt')

We will be using the same neural network model that we used here. First we have to define our placeholders for features and labels.

x = tf.placeholder('float', [None, len(train_x[0])])
y = tf.placeholder('float')

len(train_x[0]) returns the number of features.

The neural network model is defined by the neural_network_model function. After the neural network is defined, it’s time to train our model.

First we’ll capture the prediction / output of the neural network using

prediction = neural_network_model(x)

Then we have to find the cross entropy of the prediction made by our model. We are using softmax regression.

cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    logits=prediction, labels=y))
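To see what that one line computes, here is a small pure-Python sketch of softmax cross entropy averaged over a batch. It is only meant to show the arithmetic; the function names and the sample numbers are made up, and TensorFlow’s own implementation is more careful and much faster.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, one_hot_label):
    """Cross entropy between softmax(logits) and a one-hot label."""
    probs = softmax(logits)
    return -sum(y * math.log(p) for y, p in zip(one_hot_label, probs))


def mean_cross_entropy(batch_logits, batch_labels):
    """Average the per-example loss, like tf.reduce_mean does."""
    losses = [cross_entropy(l, y) for l, y in zip(batch_logits, batch_labels)]
    return sum(losses) / len(losses)


# Two made-up examples: [1, 0] is positive, [0, 1] is negative.
loss = mean_cross_entropy([[2.0, 0.0], [0.0, 0.0]], [[1, 0], [0, 1]])
print(loss)
```

The loss is small when the model puts high probability on the correct class and grows as the prediction moves away from the label, which is exactly the "difference" the optimizer tries to reduce.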

After finding the cross entropy, it is time to backpropagate and try to reduce the difference.

optimizer = tf.train.AdadeltaOptimizer(0.5).minimize(cross_entropy)

The cross entropy and the optimizer together make up the training step. We’ll start a session and train for 10 epochs.

The accuracy we could achieve was 55.44%.


The trained model is saved to ‘sentiment_model.ckpt’; later we can use it to restore our variables (i.e. weights and biases).

Making predictions :

To make predictions with the model we have just trained, we have to pre-process our input sentence so that it can be passed as features to our model. After we pre-process the input sentence, we predict.

# assumes the open tf.Session is named sess
result = (sess.run(tf.argmax(prediction.eval(
    feed_dict={x: [features[:423]]}), 1)))

We print out whether the output is positive or negative using

if result[0] == 0:
    print('Positive:', input_data)
elif result[0] == 1:
    print('Negative:', input_data)
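The check above works because positive was encoded as [1, 0] and negative as [0, 1], so the index of the largest output tells us the class. A minimal sketch of that decoding step, with a hypothetical helper name and made-up scores:

```python
def decode_prediction(scores):
    """Return 'Positive' if index 0 has the largest score, else 'Negative'."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return "Positive" if best == 0 else "Negative"


# Made-up network outputs for two sentences.
print(decode_prediction([2.3, -1.0]))
print(decode_prediction([0.1, 0.9]))
```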


As you can see, our model makes pretty good predictions even though the accuracy is 54%.

In this post we have seen how we can train a model on our own data as well as use it. In less than a week we were able to make a machine which can predict the sentiment of any sentence. Pretty interesting, right? In the next post I will introduce you to a more sophisticated version of sentiment analysis. See you in the next one!

link to complete source code :  here

Beginning ML – Sentiment Analysis Using Deep Neural Network: KnowDev

In the last post we implemented our first neural network, which can classify a set of images. In fact, that experiment can be considered the HELLO WORLD program. There is a lot more we have to consider while implementing our model. Mainly data! A lot of the time data is very raw, and we are required to perform some kind of pre-processing so that the data is in a format that TensorFlow objects can accept. A sentiment analyser is a program which can tell whether a given sentence is positive or negative. We will be using the same neural network model that we built in the last post. For better understanding, the whole process of building the sentiment analyzer is divided into parts.

In this post, we shall look at how to get raw data and convert it into the required format. Both the positive dataset and the negative dataset are available at the Git link. First we’ll download the datasets into our directory. Each dataset contains 5000 sentences. Yes! The data we have is not really enough for practical purposes.

Once we have our datasets ready in our directory, we import tensorflow (duh!) and start creating feature sets from the data.

First of all we have to create our vocabulary of words. The model we will use is bag of words. We will call this collection of words the lexicon. We will be using the nltk library to extract the words which are most relevant. The technique we are using is stemming and lemmatizing.

lexicon = [lemmatizer.lemmatize(i) for i in lexicon]

The lexicon on the RHS just contains all the words from pos.txt and neg.txt (our datasets). In fact we can employ other techniques such as removing stop words (like the, an, a..) which have no particular effect on the sentiment of the sentence. We are kind of removing those words already by keeping only words whose frequency is between 50 and 1000, so the very common words are dropped along with the very rare ones.

for w in w_counts:
    if 1000 > w_counts[w] > 50:
        l2.append(w) # l2- final lexicon list
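The same frequency filter can be tried on a toy word list with collections.Counter. This is a minimal sketch: the function name filter_lexicon and the sample words are made up, and the thresholds are lowered so the effect is visible on a handful of words.

```python
from collections import Counter


def filter_lexicon(words, min_count=50, max_count=1000):
    """Keep words whose frequency is strictly between the two bounds."""
    counts = Counter(words)
    return [w for w, c in counts.items() if min_count < c < max_count]


# "the" is too common, "rare" is too rare, "good" is kept.
toy_words = ["the"] * 5 + ["good"] * 3 + ["rare"]
final_lexicon = filter_lexicon(toy_words, min_count=1, max_count=5)
print(final_lexicon)
```

Dropping both ends keeps the lexicon small: very frequent words behave like stop words, and very rare words appear too seldom for the network to learn anything from them.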

Now that we have created our vocabulary, we can create our features. Here our lexicon size is 423. A tensor accepts floats, but the sentences we have are strings. Hence we have to use the lexicon that we created earlier to make a vector which contains the frequency of each lexicon word in the sentence.

For example, lexicon = ['dog', 'cat', 'eat', 'fight', 'food'] and the given sentence is "dog fights with cat for food". Therefore the feature set is [1, 1, 0, 1, 1].

We create a list of [features, classification] pairs. Positive is denoted as [1, 0] and negative as [0, 1].
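The example above can be reproduced with a few lines of plain Python. This is only a sketch: the normalize function is a crude stand-in for nltk’s lemmatizer (it just strips a trailing "s"), and both function names are made up for illustration.

```python
def normalize(word):
    """Crude stand-in for a lemmatizer: lowercase and drop a trailing 's'."""
    word = word.lower()
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def sentence_to_features(sentence, lexicon):
    """Count how often each lexicon word occurs in the sentence."""
    index = {w: i for i, w in enumerate(lexicon)}
    features = [0] * len(lexicon)
    for token in sentence.split():
        w = normalize(token)
        if w in index:
            features[index[w]] += 1
    return features


lexicon = ['dog', 'cat', 'eat', 'fight', 'food']
example = sentence_to_features("dog fights with cat for food", lexicon)
print(example)  # [1, 1, 0, 1, 1]
```

Words that are not in the lexicon ("with", "for") are ignored, and the position of each count always matches the position of the word in the lexicon.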

features = list(features)
featureset.append([features, classification])

Finally we’ll create our collection of feature sets for both positive and negative. The list is shuffled so that the neural network can converge.

features += sample_handling('pos.txt', lexicon, [1, 0])
features += sample_handling('neg.txt', lexicon, [0, 1])

Now the whole set is divided into training data and testing data.

train_x = list(features[:, 0][:-testing_size])
train_y = list(features[:, 1][:-testing_size])

test_x = list(features[:, 0][-testing_size:])
test_y = list(features[:, 1][-testing_size:])
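For readers without numpy at hand, the shuffle and split can be sketched with plain lists. The function name shuffle_and_split, the fixed seed, and the toy data are assumptions made for this illustration; the slicing above on a numpy array does the same job.

```python
import random


def shuffle_and_split(featureset, test_fraction=0.1, seed=42):
    """Shuffle [features, label] pairs and split them into train and test."""
    data = list(featureset)
    random.Random(seed).shuffle(data)  # fixed seed keeps the split repeatable
    split = len(data) - int(test_fraction * len(data))
    train, test = data[:split], data[split:]
    train_x = [features for features, _ in train]
    train_y = [label for _, label in train]
    test_x = [features for features, _ in test]
    test_y = [label for _, label in test]
    return train_x, train_y, test_x, test_y


# Ten toy samples: even ones are positive, odd ones negative.
toy = [([i, i + 1], [1, 0] if i % 2 == 0 else [0, 1]) for i in range(10)]
train_x, train_y, test_x, test_y = shuffle_and_split(toy, test_fraction=0.2)
print(len(train_x), len(test_x))
```

Shuffling before splitting matters here because the positive samples are appended first and the negative ones after, so an unshuffled tail would contain only negatives.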

train_x and test_x are the features, and train_y and test_y are the labels. We will be using the pickle module for permanent storage of these values so that they can be used later for training our neural network.

In this post we downloaded our own data, cleaned it to our requirements, and divided the cleaned data into training data and testing data.

In the next post we will be using this data to train our model, test it to find the accuracy, and also run the model against our own inputs! Awesome, right? I’m excited too…

See you in the next one!
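Saving and loading the four lists with pickle could look roughly like this. It is a sketch under assumptions: the helper names and the file name sentiment_set_example.pickle are made up, and the file is written to the system temp directory just to keep the example self-contained.

```python
import os
import pickle
import tempfile


def save_sets(path, train_x, train_y, test_x, test_y):
    """Write the four lists to disk as one pickled object."""
    with open(path, "wb") as f:
        pickle.dump([train_x, train_y, test_x, test_y], f)


def load_sets(path):
    """Read the four lists back in the same order."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical file name, stored in the temp directory for this demo.
path = os.path.join(tempfile.gettempdir(), "sentiment_set_example.pickle")
save_sets(path, [[1, 0]], [[1, 0]], [[0, 1]], [[0, 1]])
restored = load_sets(path)
print(restored)
```

Pickling once means the slow pre-processing does not have to be repeated every time we want to train.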

link to complete source code :
