Am; YourMove: How a 280-year-old algorithm is used in a year-old application | tech talk

This post will be updated soon…

This post is a complete description of the video posted on 15th May 2017 on AM; YourMove.

If you are a computer science student, or if you have ever looked at various algorithms, you know that devising an algorithm can be really hard, but a well-optimized algorithm is always appreciated. Classic algorithms never go out of fashion; in fact, most systems in computer science still run on those very algorithms. Algorithms are answers or solutions to real-world problems stated as interesting problem statements. I personally get excited when I see a great algorithm and always want to share that knowledge and excitement.

But nowadays, due to the boom in application development, algorithm engineering is not really a topic engineering students focus on. Yet we never know when an oldie-but-goodie algorithm might come in handy. This post focuses on one of Google's applications, Google Trips, which uses a 280-year-old algorithm!

If you do not know it, Google Trips is an application where we can plan our vacations. One of its features lets us pre-select the places we are interested in visiting, and Google creates an optimal route with minimum travel time. Google also takes care of your interests, timings, and other important parameters. These routes are termed "itineraries".

The problem statement of the application is very similar to that of the travelling salesman problem, which states:

Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city?

But the travelling salesman problem belongs to the class of NP-hard problems, for which no polynomial-time algorithm is known. So an exact TSP solution cannot be computed quickly, and if we tried, we could not expect a great user experience, which is a real deal-breaker for a company like Google.

But the TSP can be solved approximately by making some assumptions. The rest of the post talks about how the problem of creating itineraries is solved using the discoveries of some great scientists.

In 1736, Leonhard Euler studied the following question: is it possible to walk through the city of Königsberg crossing each of its bridges exactly once?

—-add pic

As it turns out, the answer is no.

Euler noticed that if all the nodes in the graph have an even number of edges (such graphs are called “Eulerian” in his honour) then, and only then, a cycle can be found that visits every edge exactly once. Keep this in mind, as we’ll rely on this fact later!

One of the approximation algorithms for the TSP is Christofides' algorithm. It assumes that the distances form a metric space (they are symmetric and obey the triangle inequality), and it guarantees that its solutions will be within a factor of 3/2 of the optimal solution length.

This is an important part of the solution and builds on Euler's paper. Here is a quick four-step rundown of how it works, followed by a small code sketch:

  1. We start with all our destinations separate and repeatedly connect together the closest two that aren’t yet connected. This doesn’t yet give us an itinerary, but it does connect all the destinations via a minimum spanning tree of the graph.
  2. We take all the destinations that have an odd number of connections in this tree (Euler proved there must be an even number of these), and carefully pair them up.
  3. Because all the destinations now have an even number of edges, we’ve created a Eulerian graph, so we create a route that crosses each edge exactly once.
  4. We now have a great route, but it might visit some places more than once. No problem, we find any double visits and simply bypass them, going directly from the predecessor to the successor.
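To make the four steps concrete, here is a small, self-contained C++ sketch (my own, not Google's code). It uses made-up points in the plane as destinations with Euclidean distances and, for simplicity, a greedy pairing of the odd-degree vertices instead of a true minimum-weight perfect matching; that substitution keeps the structure of the algorithm but gives up the formal 3/2 guarantee.

#include <cmath>
#include <iostream>
#include <vector>
using namespace std;

// Euclidean distance: symmetric and obeys the triangle inequality.
double dist(pair<double,double> a, pair<double,double> b) {
    return hypot(a.first - b.first, a.second - b.second);
}

int main() {
    // Example destinations (arbitrary coordinates).
    vector<pair<double,double>> pts = {{0,0},{2,0},{2,2},{0,2},{1,3},{3,1}};
    int n = pts.size();
    vector<vector<int>> adj(n);

    // Step 1: minimum spanning tree (Prim's algorithm).
    vector<bool> inTree(n, false);
    vector<double> best(n, 1e18);
    vector<int> parent(n, -1);
    best[0] = 0;
    for (int it = 0; it < n; ++it) {
        int u = -1;
        for (int v = 0; v < n; ++v)
            if (!inTree[v] && (u == -1 || best[v] < best[u])) u = v;
        inTree[u] = true;
        if (parent[u] != -1) { adj[u].push_back(parent[u]); adj[parent[u]].push_back(u); }
        for (int v = 0; v < n; ++v)
            if (!inTree[v] && dist(pts[u], pts[v]) < best[v]) { best[v] = dist(pts[u], pts[v]); parent[v] = u; }
    }

    // Step 2: pair up odd-degree vertices (greedy stand-in for the
    // minimum-weight perfect matching used by the real algorithm).
    vector<int> odd;
    for (int v = 0; v < n; ++v) if (adj[v].size() % 2) odd.push_back(v);
    vector<bool> used(odd.size(), false);
    for (int i = 0; i < (int)odd.size(); ++i) {
        if (used[i]) continue;
        int bestJ = -1;
        for (int j = i + 1; j < (int)odd.size(); ++j)
            if (!used[j] && (bestJ == -1 || dist(pts[odd[i]], pts[odd[j]]) < dist(pts[odd[i]], pts[odd[bestJ]])))
                bestJ = j;
        used[i] = used[bestJ] = true;
        adj[odd[i]].push_back(odd[bestJ]);
        adj[odd[bestJ]].push_back(odd[i]);
    }

    // Step 3: every vertex now has even degree, so an Eulerian circuit
    // exists; find one with Hierholzer's algorithm.
    vector<vector<int>> g = adj;
    vector<int> stk = {0}, euler;
    while (!stk.empty()) {
        int v = stk.back();
        if (g[v].empty()) { euler.push_back(v); stk.pop_back(); }
        else {
            int u = g[v].back(); g[v].pop_back();
            for (size_t k = 0; k < g[u].size(); ++k)
                if (g[u][k] == v) { g[u].erase(g[u].begin() + k); break; }  // drop one copy of the reverse edge
            stk.push_back(u);
        }
    }

    // Step 4: shortcut repeated visits to get the final tour.
    vector<bool> seen(n, false);
    vector<int> tour;
    for (int v : euler) if (!seen[v]) { seen[v] = true; tour.push_back(v); }
    for (int v : tour) cout << v << " ";
    cout << tour[0] << "\n";  // return to the starting destination
}

The shortcutting in step 4 never makes the route longer precisely because of the triangle inequality assumed in step 0: going directly is never worse than going via the skipped stop.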

This is how a couple of algorithms and the discoveries behind them are cleverly used to solve the problem at hand. I hope this post inspires and motivates you to start looking at algorithms from a different perspective to solve real-world problems.

And I’ll see you in the next one.

Sources: https://research.googleblog.com/2016/09/the-280-year-old-algorithm-inside.html

 

 

 

Am; YourMove: Visualize Sorting a Stack!

This blog post is a summarization of a video from the YouTube channel AM; YourMove. This is the link to the video.

In this video, the presenter discussed how to sort a stack using just basic stack operations like push, pop, and peek. Another constraint is that we can only use the stack data structure, though we have the flexibility to use more than one stack (if we need to).

It is one of the simpler interview questions out there, but the presenter made it more interesting by using the Processing IDE as a visualization tool to show viewers how the elements in the stack are pushed and popped. The video is attached at the end of the post.

The presenter notes that the algorithm's complexity is O(n²), which by no means sounds optimal, but under these constraints it is the best we can get. The algorithm is discussed below.

We need 2 stacks. The idea is to pull an item from the original stack and push it on the other stack. If pushing this item would violate the sort order of the new stack, we need to remove enough items from it so that it’s possible to push the new item. Since the items we removed are on the original stack, we’re back where we started.

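Here is a minimal C++ sketch of the two-stack sort described above (my reconstruction, not necessarily the presenter's exact code):

#include <iostream>
#include <stack>

// Sort a stack using one auxiliary stack; the largest element ends up on top.
std::stack<int> sortStack(std::stack<int> input) {
    std::stack<int> sorted;
    while (!input.empty()) {
        int item = input.top();
        input.pop();
        // Move back any elements larger than `item`, so `sorted` stays
        // in ascending order from bottom to top.
        while (!sorted.empty() && sorted.top() > item) {
            input.push(sorted.top());
            sorted.pop();
        }
        sorted.push(item);
    }
    return sorted;
}

int main() {
    std::stack<int> s;
    for (int v : {3, 1, 4, 1, 5, 9, 2}) s.push(v);
    std::stack<int> result = sortStack(s);
    while (!result.empty()) {  // pops in descending order: 9 5 4 3 2 1 1
        std::cout << result.top() << " ";
        result.pop();
    }
}

In the worst case (a stack already sorted the wrong way), each of the n items forces up to n moves back to the original stack, which is where the O(n²) bound comes from.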

Link to the visualization sample video (CLICK HERE). You can check out the code HERE.

This was one of the most interesting topics yet on AM; YourMove. Check out the channel here: SUBSCRIBE NOW!

Am; YourMove: What is a Load Balancer and Consistent Hashing? #2

This is a continuation of Am; YourMove: What is a Load Balancer and Consistent Hashing? #1.

In the last post, we were introduced to concepts like hashing. In this post, we will learn in detail why "just" hashing will not work and why there is a need for an algorithm like consistent hashing. In particular, we will learn more about the algorithm developed by Google.

We now know that we have to load-balance requests across multiple data centers. Consider an example of why the classic hashing technique is not sufficient: if you have a collection of n cache machines, then a common way of load balancing across them is to put object o in cache machine number hash(o) mod n. This works well until you add or remove cache machines (for whatever reason), for then n changes and every object is hashed to a new location.
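A quick way to see the problem is to count how many objects change machines when n changes. Here is a tiny C++ demo (my own illustration, with made-up object names):

#include <functional>
#include <iostream>
#include <string>

int main() {
    std::hash<std::string> hashFn;
    int moved = 0, total = 1000;
    for (int i = 0; i < total; ++i) {
        size_t h = hashFn("object-" + std::to_string(i));
        // Does the object land on a different machine when n goes from 4 to 5?
        if (h % 4 != h % 5) ++moved;
    }
    // Typically around 80% of the objects move, even though only one machine was added.
    std::cout << moved << " of " << total << " objects moved\n";
}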

This is why consistent hashing comes into the picture. It is interesting to note that it is only the client that needs to implement the consistent hashing algorithm – the memcached server is unchanged. Other systems that employ consistent hashing include Chord, which is a distributed hash table implementation, and Amazon’s Dynamo, which is a key-value store (not available outside Amazon).

The algorithm we will be discussing in this post is called "consistent hashing with bounded loads". Its main aim is to achieve both uniformity and consistency in the resulting allocations.

We can think about the servers as bins and clients as balls.

 

The uniformity objective encourages all bins to have a load roughly equal to the average density (the number of balls divided by the number of bins). For some parameter ε, we set the capacity of each bin to either the floor or the ceiling of the average load times (1+ε). This extra capacity allows us to design an allocation algorithm that meets the consistency objective in addition to the uniformity property.

Imagine a given range of numbers overlaid on a circle. We apply a hash function to balls and a separate hash function to bins to obtain numbers in that range that correspond to positions on that circle. We then start allocating balls in a specific order independent of their hash values (let’s say based on their ID). Then each ball is moved clockwise and is assigned to the first bin with spare capacity.

Consider the example above where 6 balls and 3 bins are assigned using two separate hash functions to random locations on the circle. For the sake of this instance, assume the capacity of each bin is set to 2. We start allocating balls in the increasing order of their ID values. Ball number 1 moves clockwise and goes to bin C. Ball number 2 goes to A. Balls 3 and 4 go to bin B. Ball number 5 goes to bin C. Then ball number 6 moves clockwise and hits bin B first. However, bin B has capacity 2 and already contains balls 3 and 4. So ball 6 keeps moving to reach bin C but that bin is also full. Finally, ball 6 ends up in bin A that has a spare slot for it.
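Here is a small C++ sketch of this allocation (my own construction based on the description above, not Google's implementation): bins and balls are hashed onto a circle, and each ball walks clockwise from its position to the first bin with spare capacity. The parameters are arbitrary, so the computed capacity here is 3 rather than the walkthrough's 2.

#include <cmath>
#include <functional>
#include <iostream>
#include <map>
#include <string>
#include <vector>

int main() {
    std::vector<std::string> bins  = {"A", "B", "C"};
    std::vector<std::string> balls = {"1", "2", "3", "4", "5", "6"};
    double epsilon = 0.5;
    // Capacity per bin: ceiling of the average load times (1 + epsilon).
    int capacity = (int)std::ceil((double)balls.size() / bins.size() * (1 + epsilon));

    std::hash<std::string> h;
    // Bin positions on the circle, kept sorted so we can search clockwise.
    std::map<size_t, std::string> circle;
    for (const auto& b : bins) circle[h("bin-" + b)] = b;

    std::map<std::string, int> load;
    for (const auto& ball : balls) {  // allocate in ID order, independent of hash values
        auto it = circle.lower_bound(h("ball-" + ball));
        while (true) {                // walk clockwise, wrapping around the circle
            if (it == circle.end()) it = circle.begin();
            if (load[it->second] < capacity) break;  // first bin with spare capacity
            ++it;
        }
        ++load[it->second];
        std::cout << "ball " << ball << " -> bin " << it->second << "\n";
    }
}

Because the total capacity is at least the number of balls, the clockwise walk always terminates, and no bin can ever exceed its (1+ε) bound.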

 


 

Upon any update in the system (ball or bin insertion/deletion), the allocation is recomputed to keep the uniformity objective. The art of the analysis is in showing that a small update (a small number of insertions and deletions) results in only minor changes in the state of the allocation, and therefore the consistency objective is met. The paper also shows that every ball removal or insertion in the system results in O(1/ε²) movements of other balls.

This algorithm is not just theoretical! It is in fact used in production at a well-known company. Andrew Rodland from Vimeo found the paper and used it for the load balancing project at Vimeo. The results were dramatic: applying these algorithmic ideas helped them decrease cache bandwidth by a factor of almost 8, eliminating a scaling bottleneck. He recently summarized this story in a blog post detailing his use case.

To check out a simple implementation of the consistent hashing algorithm, see this link.

The algorithm is open-source, so anyone can use it. More insights, like improvement statistics and performance analysis, can be found in the published paper: paper

This was one of the most interesting topics yet on AM; YourMove. Check out the channel here: SUBSCRIBE NOW!

And I’ll see you in the next one!

Am; YourMove: What is a Load Balancer and Consistent Hashing? #1

link to the actual video

On 10th April 2017, a new video was released on the channel AM;YourMove.

In this video, the presenter took up an interesting topic: "load balancing and consistent hashing". The presenter started off the video with a question: how do large companies like Google, Amazon, or Facebook manage to process such a huge number of requests? To give you a sense of scale, Google Search alone gets around 3.2 billion requests in a single day. Just imagine how many requests YouTube (the largest video streaming platform in the world) or Facebook gets!

The presenter then explains that this is possible because the load is balanced across multiple data centers situated in multiple places. That is why we often hear that Facebook is installing a new data center. Let me explain what a data center is: a large group of networked servers used for the remote storage, processing, or distribution of large amounts of data. The presenter then explains why there is a need to set up new data centers. The obvious reason is that an existing data center is getting overloaded, or, if not overloaded, that its efficiency has decreased. For many reasons, a data center can only process a certain number n of requests at any instant of time.

So we know why there is a requirement for new data centers. Multiple data centers should be kept well in sync so that none gets overloaded. The presenter then explains that this distribution of requests across multiple data centers is done by a load balancer. A load balancer can be software, hardware, or both.

 


 

Load balancing aims to optimize resource use, maximize throughput, minimize response time, and avoid overloading any single resource. Any addition or removal of these resources should not affect the user experience. This is usually achieved with a shared database or an in-memory session database such as Memcached. So basically, a load balancer distributes clients uniformly across multiple servers such that none gets overloaded. Further, it is desirable to find an allocation that does not change very much over time in a dynamic environment in which both clients and servers can be added or removed at any time. In other words, we need the allocation of clients to servers to be consistent over time.

But how is this allocation of clients possible? Certainly, some algorithm is used, right? The algorithm used is called consistent hashing.

But let us first understand what hashing is. Hashing is a technique for storing data so that it can be accessed quickly, directly by an index in a location computed from the data by a hash function.
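As a tiny illustration (my own example, not from the video), the index is simply computed from the key by the hash function:

#include <functional>
#include <iostream>
#include <string>
#include <vector>

int main() {
    const size_t tableSize = 8;
    std::vector<std::string> table(tableSize);
    std::hash<std::string> hashFn;

    size_t index = hashFn("alice") % tableSize;  // the slot is computed from the key
    table[index] = "alice@example.com";          // store

    std::cout << table[hashFn("alice") % tableSize] << "\n";  // retrieve directly, no search
}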


The hashing concept is used in some famous databases like Amazon's Dynamo. The hash function is used to index the original value or key, and it is then used each time the data associated with that value or key is to be retrieved. Hashing is a one-way operation: there's no need to "reverse engineer" the hash function by analyzing the hashed values, and in fact an ideal hash function can't be derived by such analysis. A good hash function should also not produce the same hash value from two different inputs; if it does, this is known as a collision. A hash function that offers an extremely low risk of collision may be considered acceptable.

The presenter then asks the viewers what the difference between hashing and consistent hashing could be. First, let's understand why we need a different algorithm. The systems we are dealing with are dynamic, i.e., a server or client can go offline at any point in time. So if any server goes down, all the clients allocated to it should be re-allocated to other servers to preserve a user-friendly experience.

Consider this, If you have a collection of n cache machines then a common way of load balancing across them is to put object o in cache machine number hash(o) mod n. This works well until you add or remove cache machines (for whatever reason), for then n changes and every object is hashed to a new location. This can be catastrophic since the originating content servers are swamped with requests from the cache machines. It’s as if the cache suddenly disappeared. It would be nice if, when a cache machine was added, it took its fair share of objects from all the other cache machines. Equally, when a cache machine was removed, it would be nice if its objects were shared among the remaining machines. This is exactly what consistent hashing does – consistently maps objects to the same cache machine, as far as is possible, at least.

Seems too technical or difficult? The presenter promises to make the concepts simple enough to be understood by a non-technical person. The presenter then informs us that this is just part 1 of the video. In the next video, the actual algorithm of load balancing and how consistent hashing works will be explained. Stay tuned to the channel AM;YourMove.

If you haven’t already subscribed to the channel, subscribe NOW! It is a great channel for the latest information related to computer science.

Subscribe here -> CLICK HERE!

AM; YourMove: Check if a Tree is a Subtree or not!

link to the actual video

This is the very first blog post: a description of the YouTube video released on 3rd April 2017.

The topic of the video is checking whether a binary tree is a subtree of another. This video was released as part of a video series on the tree data structure. The video starts off by describing the problem statement. The definition of a subtree can sometimes be misunderstood, so the presenter takes the help of Wikipedia to find the actual definition of a subtree. According to Wikipedia,

    A subtree of a tree T is a tree consisting of a node in T and all of its descendants in T.

Afterwards, the presenter right away starts finding a possible solution to the problem. To solve it, the presenter introduces the concept of tree traversals. Tree traversals are algorithms that traverse the tree in some fashion to collect the values of the nodes. There are 3 very popular tree traversals.

  1. Pre-order
  2. Post-order
  3. In-order

The presenter goes on to explain these traversals (a small code sketch follows the lists).

Pre-Order :

  1. Check if the current node is empty/null.
  2. Display the data part of the root (or current node).
  3. Traverse the left subtree by recursively calling the pre-order function.
  4. Traverse the right subtree by recursively calling the pre-order function.

In-Order :

  1. Check if the current node is empty/null.
  2. Traverse the left subtree by recursively calling the in-order function.
  3. Display the data part of the root (or current node).
  4. Traverse the right subtree by recursively calling the in-order function.

Post-Order :

  1. Check if the current node is empty/null.
  2. Traverse the left subtree by recursively calling the post-order function.
  3. Traverse the right subtree by recursively calling the post-order function.
  4. Display the data part of the root (or current node).
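To make these concrete, here is a C++ sketch (my reconstruction) of the traversals as helper functions that collect node values into a vector; the isSubtree code later in the post assumes preOrderTraversal and inOrderTraversal with exactly these signatures.

#include <vector>
using std::vector;

struct node {
    int data;
    node *left;
    node *right;
};

void preOrderTraversal(node *n, vector<int> &out) {
    if (n == nullptr) return;          // 1. empty/null check
    out.push_back(n->data);            // 2. visit the root
    preOrderTraversal(n->left, out);   // 3. left subtree
    preOrderTraversal(n->right, out);  // 4. right subtree
}

void inOrderTraversal(node *n, vector<int> &out) {
    if (n == nullptr) return;
    inOrderTraversal(n->left, out);    // left subtree first
    out.push_back(n->data);            // then the root
    inOrderTraversal(n->right, out);   // then the right subtree
}

void postOrderTraversal(node *n, vector<int> &out) {
    if (n == nullptr) return;
    postOrderTraversal(n->left, out);
    postOrderTraversal(n->right, out);
    out.push_back(n->data);            // root comes last
}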

 

After that, the presenter reveals that traversals exhibit a few interesting characteristics of a binary tree.

For example,

  1. If the binary tree is a BST, then the in-order traversal gives us the node values in ascending order.
  2. For any binary tree, the combination of the pre-order (or post-order) and in-order traversals uniquely identifies the tree.

The 2nd point is interesting and can be directly applied to solve our problem, i.e., to find whether one tree is a subtree of another tree or not. But how? The presenter explains that if the smaller tree's pre-order and in-order elements are subarrays of the bigger tree's pre-order and in-order elements respectively, then we can confirm that it is a subtree. The presenter then starts implementing the actual algorithm.

 

// Method to find if one tree is a subtree of another.
bool BinaryTree::isSubtree(node *big, node *small) {
    // Collectors for the node values encountered while traversing each tree.
    vector<int> preOrderBigCollector;
    vector<int> inOrderBigCollector;
    vector<int> preOrderSmallCollector;
    vector<int> inOrderSmallCollector;

    preOrderTraversal(big, preOrderBigCollector);
    preOrderTraversal(small, preOrderSmallCollector);

    inOrderTraversal(big, inOrderBigCollector);
    inOrderTraversal(small, inOrderSmallCollector);

    // The small tree is a subtree iff its traversals appear as contiguous
    // subarrays of the big tree's traversals.
    return isSubArray(preOrderBigCollector, preOrderSmallCollector) &&
           isSubArray(inOrderBigCollector, inOrderSmallCollector);
}

 

This is one of the efficient solutions to this problem; the ideal worst-case time complexity is O(n). But in the video, the presenter used a naive approach to the subarray check, which worsens the complexity to O(n²). If we instead use the KMP pattern matching algorithm for the subarray check, the ideal time complexity can be achieved.
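For completeness, here is a sketch of the naive subarray check (my version, not the presenter's code); its O(n·m) cost is what drags the total to O(n²), and replacing it with KMP brings the check down to O(n + m):

#include <vector>
using std::vector;

// Return true if `small` appears as a contiguous subarray of `big`.
bool isSubArray(const vector<int> &big, const vector<int> &small) {
    if (small.size() > big.size()) return false;
    for (size_t start = 0; start + small.size() <= big.size(); ++start) {
        size_t i = 0;
        while (i < small.size() && big[start + i] == small[i]) ++i;
        if (i == small.size()) return true;  // matched every element
    }
    return false;
}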

author’s note:

This is my first post. As you may have realized, I have used a 3rd-person style of writing. I can sometimes criticize what I have done in the video (of course, for positive personal growth). Hope you liked this post!

LIKE + SHARE + SUBSCRIBE to AM;YourMove

https://www.youtube.com/channel/UCkAou6vl72JU7DMFYs84YYg

Project: Personal website #2

Directives are the core of angular2; even components come under directives. There are various kinds of directives.

Attribute directives

ngClass and ngStyle are 2 examples. They are used to apply CSS classes and inline styles to DOM elements.

Structural directives

*ngIf and *ngFor are 2 examples. These kinds of directives are used to change the structure of the DOM. The "*" prefix is the syntax for applying these directives.
The general syntax to use *ngIf:

<li *ngIf="condition">some-statement</li>

But *ngIf can also be written differently, using property binding on a template element:

<template [ngIf]="condition">
  <li>some-statement</li>
</template>

In the previous post, I had used just a single dummy object to display the data. After learning about the *ngFor directive, we can use it to display a collection of objects.
components:

I have realised that I did not explain much about how the components are laid out and the various event bindings used.

about-component

This component is the same as before. There is not much to be done other than changing some UI. I thought I would make my UI fancier once I get the core angular2 part done! As of now it looks like this (below). I have used a dummy image.

projects-component

I have used *ngFor to insert multiple projects as <li> items. The projects component is actually subdivided into project-list and project-detail components. The click event on any project item is registered, and that particular item's details are displayed on the right side. I have created a Project.ts class to store project details.

skills-component

Similar to the projects component, I have used *ngFor to display the list of items. I wanted to show professional and interpersonal skills separately, hence I have used 2 more components under the skills component. I have created a Skill.ts class to store skill details.

contact-component

I’m keeping the contact component as a constant footer. I made the contact strip dynamic, meaning that however many contact objects I add in the future, the strip can adapt. The contact component might have gotten the biggest makeover by far!

Nonetheless, I’m impressed by the progress I’m making. I’m also developing another project alongside this one for learning purposes. I’ll upload it to Git so that you can look at it too. Right now, I’m having a bit of a problem with a dropdown using materialize.css.


 

See you in the next one!

Project: Personal website

I went to a couple of events in the month of November, where I got a chance to talk to amazing people with good experience in the industry. As I’m about to start my “official” professional career, I wanted to get some insights into how companies work and what technologies are being used. To my surprise, 75% of the people talked about Angular and what an amazing technology it is. So, naturally, I wanted to learn it as well!

Many people at the events talked about various online courses. I have learned a lot of things on the internet but have never bought a course! I love free stuff! But people suggested that some paid courses are worth it. As angular2 is pretty new, there were no good free courses online. Around that time I saw an ad on Facebook for a Black Friday sale on Udemy. I took no time to grab the offer and bought an angular2 course for just $14.

I must say the course is really good. All the topics are carefully chosen and well explained. So I’m learning the technology and at the same time building a personal website project. To fully understand angular2, I will surely build at least a couple of projects.

The whole of angular2 revolves around components, or more precisely directives. Therefore, initially I have thought of components such as about, contact, projects, and skills.

The project as of now: I have completed 4 sections of the Udemy course. I have learned up to data binding, property binding, event binding, and components.

components:

about-component

projects-component

skills-component

contact-component

So, clearly a long way to go! But I guarantee that learning and mastering this technology will be a great asset to my career. And by the way, I’m loving angular2.

See you in the next one!