Have you ever wondered how social networks, such as Facebook, recommend friends to you? Most of the social networks use highly sophisticated algorithms for this, but for this assignment you will implement a fairly naive algorithm to recommend the most likely new friend to users of a social network. In particular, you will recommend the most probable user to befriend based upon the intersection of your common friends. In other words, the user that you will suggest to Person A is the person who has the most friends in common with Person A, but who currently is not friends with Person A.
Five text files have been provided for you to run your program with. Each represents a social network. Three are small test files containing a made-up set of users and their friendships (these files are net1.txt, net2.txt and net3.txt). The two are a subset of a real Facebook dataset, which was obtained from: fhttps://snap.stanford.edu/data/egonets-Facebook.html
The format of all five files is the same:
The first line of the file is an integer representing the number of users in the given network.
The following lines are of the form: user_u user_v where user_u and user_v are the (non-negative integer) IDs of two users who are friends.
In addition user_u is always less than user_v
For example, here is a very small file that has 5 users in the social network: 5
0 1
1 2
1 8
2 3
The above is a representation of a social network that contains 5 users.
User ID=0 is friends with User IDs = 1
User ID=1 is friends with User IDs = 0, 2, 8
User ID=2 is friends with User IDs = 1, 3
User ID=3 is friends with User IDs = 2
User ID=8 is friends with User IDs = 1
Spend time studying the above small example to understand the model. For example, notice that since friendship is a symmetric relationship the social media networks in this assignment, if user_u is friends with user_v, that means that user_v is also friends with user_u. Such "duplicate" friendships are not present in the file. In particular each friendship is listed once in such way that user_u < user_v
Also note that, while you can assume that user IDs are sorted, you cannot assume that they are consecutive integers differing by one. For example the user IDs above are: 0,1,2,3,8.
You can also assume that in each file the users are sorted from smallest to largest (in the above example you see that users appear as: 0 1 1 2). Specifically, friendships of user_u appear before friendships of user_v if and only if user_u < user_v. And also for each user its friends appear sorted, for example for user 1 friendship with friend 2 appears before friendship with friend 4.
To complete the assignment you will have to code the following 9 functions. I strongly recommend you code the in the order given below and do not move onto coding a function until you complete all before. The function descriptions, including what they need to do, are given in a4_xxxxxx.py.
1. create_network(file_name) This is the most important (and possibly the most difficult) function to solve. The function needs to read a file and return a list of tuples representing the social network from the file. In particular the function returns a list of tuples where each tuple has 2 elements: the first is an integer representing an ID of a user and the second is the list of integers representing his/her friends. In the a4_xxxxxx.py I refer the list that create_network function returns as a 2D-list for friendship network (although one can argue that is is a 3D list). In addition the 2D-list for friendship network that must create_network function returns must be sorted by the ID and a list of friends in each tuple also must be sorted.
So for the example above, this function should return the following 2D-list for 2D-list for friendship network:
[(0, [1]), (1, [0,2,8]), (2,[1,3]), (3,[2]), (8,[1])]
More examples:
>>> net1=create_network("net1.txt")
>>> net1
[(0, [1, 2, 3]), (1, [0, 4, 6, 7, 9]), (2, [0, 3, 6, 8, 9]), (3, [0, 2, 8, 9]), (4, [1, 6, 7, 8]),
(5, [9]), (6, [1, 2, 4, 8]), (7, [1, 4, 8]), (8, [2, 3, 4, 6, 7]), (9, [1, 2, 3, 5])]
>>> net2=create_network("net2.txt")
>>> net2
[(0, [1, 2, 3, 4, 5, 6, 7, 8, 9]), (1, [0, 4, 6, 7, 9]), (2, [0, 3, 6,8, 9]), (3, [0, 2, 8, 9]), (4, [0, 1, 6, 7, 8]),
(5, [0, 9]), (6, [0, 1, 2, 4, 8]), (7, [0, 1, 4, 8]), (8, [0, 2, 3, 4, 6, 7]), (9, [0, 1, 2, 3, 5])]
>>> net3=create_network("net3.txt")
>>>
[(0, [1, 2, 3, 4, 5, 6, 7, 8, 9]), (1, [0, 4, 6, 7, 9]), (2, [0, 3, 6,8, 9]), (3, [0, 2, 8, 9]), (4, [0, 1, 6, 7, 8]),
(5, [0, 9]), (6, [0, 1, 2, 4, 8]), (7, [0, 1, 4, 8]), (8, [0, 2, 3, 4, 6, 7]), (9, [0, 1, 2, 3, 5]),
(100, [112]), (112, [100, 114]), (114, [112])]
>>> net4=create_network("big.txt")
>>> net4[500:502]
[(500, [348, 353, 354, 355, 361, 363, 368, 373, 374, 376, 378, 382, 388, 391, 392, 396, 400, 402, 404, 408, 409, 410,
2. getCommonFriends(user1, user2, network)
>>> getCommonFriends(3,1,net1)
[0, 9]
>>> getCommonFriends(0,112,net3)
[]
>>> getCommonFriends(217,163,net4)
[0, 100, 119, 150]
3. recommend(user, network)
Read the docstrings to understand how this function should work. Understand why the given friends are recommended in the examples below including why no friend is recommended for 0 in net2 and 112 in net 3.
>>> recommend(6,net1)
7
>>> recommend(4,net2)
2
>>> recommend(0,net2)
>>> recommend(114, net3)
100
>>> recommend(112,net3)
>>> recommend(217,net4)
163
4. k_or_more_friends(network, k)
>>> k_or_more_friends(net1, 5)
3
>>> k_or_more_friends(net2, 8)
1
3
>>> k_or_more_friends(net3, 12)
0
>>> k_or_more_friends(net4, 70)
33
5. maximum_num_friends(network)
>>> maximum_num_friends(net1)
5
>>> maximum_num_friends(net2)
9
>>> maximum_num_friends(net3)
9
>>> maximum_num_friends(net4)
347
6. people_with_most_friends(network)
>>> people_with_most_friends(net1)
[1, 2, 8]
>>> people_with_most_friends(net2)
[0]
>>> people_with_most_friends(net3)
[0]
>>> people_with_most_friends(net4)
[0]
7. average_num_friends(network)
>>> average_num_friends(net1)
3.8
>>> average_num_friends(net2)
5.0
>>> average_num_friends(net3)
4.153846153846154
>>> average_num_friends(net4)
19.78
8. knows_everyone(network)
>>> knows_everyone(net1)
False
>>> knows_everyone(net2)
True
>>> knows_everyone(net3)
False
>>> knows_everyone(net4)
False
9. get_uid(network)
>>> get_uid(net1)
Enter an integer for a user ID:alsj
That was not an integer. Please try again.
Enter an integer for a user ID: twenty
That was not an integer. Please try again.
Enter an integer for a user ID:9aslj
That was not an integer. Please try again.
Enter an integer for a user ID:100000
That user ID does not exist. Try again.
Enter an integer for a user ID:4.5
That was not an integer. Please try again.
Enter an integer for a user ID: -10
That user ID does not exist. Try again.
Enter an integer for a user ID:-1
That user ID does not exist. Try again.
Enter an integer for a user ID:7
7