Facebook Common Friends Calculation using MapReduce, Thesis of Accounting

How Facebook can use MapReduce to calculate common friends between users and store the results for quick lookup, with an example of how the map and reduce functions are implemented for this problem.

Typology: Thesis

2015/2016

Uploaded on 11/06/2016

Sumon.Biswas
Assignment 02
Questions:
Facebook has a list of friends (note that friendship is bi-directional on Facebook: if I'm
your friend, you're mine). They also have lots of disk space and they serve hundreds of millions
of requests every day. They've decided to pre-compute calculations when they can to reduce the
processing time of requests. One common processing request is the "You and Joe have 230
friends in common" feature. When you visit someone's profile, you see a list of friends that you
have in common. This list doesn't change frequently, so it would be wasteful to recalculate it every
time you visit the profile (sure, you could use a decent caching strategy, but then I wouldn't be
able to continue writing about MapReduce for this problem). We're going to use MapReduce so
that we can calculate everyone's common friends once a day and store those results. Later on, it's
just a quick lookup. We've got lots of disk, and it's cheap.
Assume the friends are stored as Person -> [List of Friends]. Our friends list is then:
A -> B C D
B -> A C D E
C -> A B D E
D -> A B C E
E -> B C D
1. Find the map function using the Google MapReduce language.
2. Find the common friends of B and D using the reduce function.
Answer:
MapReduce is a framework originally developed at Google that allows for easy large-scale
distributed computing across a number of domains. Apache Hadoop is an open-source
implementation.
Each line will be an argument to a mapper. For every friend in the list of friends, the mapper will
output a key-value pair. The key will be the person paired with that friend. The value will be the
person's full list of friends. The key will be sorted so that the two names are in order, causing both
records for a given pair of friends to go to the same reducer. This is hard to explain with text, so
let's just do it and see if you can see the pattern. After all the mappers are done running, you'll
have a list like this:
For map(A -> B C D):

(A B) -> B C D
(A C) -> B C D
(A D) -> B C D

For map(B -> A C D E): (Note that A comes before B in the key)
(A B) -> A C D E
(B C) -> A C D E
(B D) -> A C D E
(B E) -> A C D E

For map(C -> A B D E):
(A C) -> A B D E
(B C) -> A B D E
(C D) -> A B D E
(C E) -> A B D E

For map(D -> A B C E):
(A D) -> A B C E
(B D) -> A B C E
(C D) -> A B C E
(D E) -> A B C E

And finally for map(E -> B C D):
(B E) -> B C D
(C E) -> B C D
(D E) -> B C D
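
The mapper outputs listed above can be reproduced with a short Python sketch. This is only an illustration under stated assumptions, not Google's or Hadoop's actual API: the function name `map_friends` and the in-memory dictionary `friend_lists` are hypothetical, and a real job would read each line from distributed storage.

```python
def map_friends(person, friends):
    """Yield one (key, value) pair per friend of `person`.

    The key is the sorted pair (person, friend); the value is the
    person's full friend list. `map_friends` is a hypothetical name.
    """
    for friend in friends:
        key = tuple(sorted((person, friend)))
        yield key, list(friends)


# The friend lists from the assignment, held in memory for illustration.
friend_lists = {
    "A": ["B", "C", "D"],
    "B": ["A", "C", "D", "E"],
    "C": ["A", "B", "D", "E"],
    "D": ["A", "B", "C", "E"],
    "E": ["B", "C", "D"],
}

# Run the mapper once per input line and collect every emitted pair.
mapped = [
    pair
    for person, friends in friend_lists.items()
    for pair in map_friends(person, friends)
]

for key, value in mapped:
    print(f"({' '.join(key)}) -> {' '.join(value)}")
```

Sorting the two names inside the key is what makes `(A B)` emitted by A's line and `(A B)` emitted by B's line identical, so both records can later be routed to the same reducer.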

Before we send these key-value pairs to the reducers, we group them by their keys and get:
(A B) -> (A C D E) (B C D)
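
The grouping step, and a reducer for question 2, can be sketched in Python as well. The reducer here is an assumption: it intersects the two friend lists collected for a pair, which is one natural way to obtain common friends. The names `group_by_key` and `reduce_common` are hypothetical, and the framework's shuffle phase is simulated with an in-memory dictionary.

```python
from collections import defaultdict

# The friend lists from the assignment, held in memory for illustration.
friend_lists = {
    "A": ["B", "C", "D"],
    "B": ["A", "C", "D", "E"],
    "C": ["A", "B", "D", "E"],
    "D": ["A", "B", "C", "E"],
    "E": ["B", "C", "D"],
}


def group_by_key(pairs):
    """Simulate the shuffle: collect all values emitted under the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_common(key, values):
    """Assumed reducer: intersect the friend lists collected for one pair."""
    common = set(values[0])
    for friends in values[1:]:
        common &= set(friends)
    return key, sorted(common)


# Map phase, inlined: one (sorted pair, friend list) record per friend.
pairs = (
    (tuple(sorted((person, friend))), friends)
    for person, friends in friend_lists.items()
    for friend in friends
)

grouped = group_by_key(pairs)
print(reduce_common(("B", "D"), grouped[("B", "D")]))
# prints (('B', 'D'), ['A', 'C', 'E'])
```

Under this assumed reducer, B and D have A, C and E in common. The order of the two lists inside a group does not matter, because set intersection is commutative.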