
SOCIAL ANALYTICS
Assignment I: Who should Rebecca Engage with on Twitter?

Dr. Qi Deng

PART A (7 points)

OVERVIEW

Rebecca Lipstein is the Councilor of the Kitchissippi Ward in Ottawa. She is active on social media and would like to use Twitter to help promote a sustainability initiative within the ward (note: this is a made-up initiative for this assignment). Currently, the Councilor has many Twitter followers, but she would like to ensure that her new initiative has the greatest reach and engagement possible.

Provided to you are several sources of data for your analysis.

DATASET

NoLipsteinConversationEdges.csv (You can download from Brightspace under Assignment 1)

This is a data set detailing some Twitter activity matching keywords relevant to the Kitchissippi Ward. It includes only the “replies to” and “mention” edges. The data has been cleansed of Twitter users who already ‘follow’ or are ‘followed by’ Councilor Lipstein. Thus, the data should reflect conversations among people relevant to the Kitchissippi Ward with whom Councilor Lipstein has no social media connection. Additionally, depending on the keyword search used, the data set may or may not contain irrelevant data (data that has nothing to do with ward or City of Ottawa activity). Further, please note that some of the data is real and some has been fabricated for the purposes of this assignment. The data is interpreted as follows: the ‘source’ Twitter user either ‘mentions’ or ‘replies to’ the ‘target’ Twitter user.
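As an optional sanity check on this interpretation outside Gephi, the following is a minimal Python sketch that counts how often each ‘target’ user is replied to. It runs on a small made-up sample, not on the assignment data. The column name `relationship` and the label `Replies to` are assumptions for illustration; check the actual headers and labels in NoLipsteinConversationEdges.csv before adapting it.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows; the real file's column names and labels may differ.
SAMPLE = """source,target,relationship
alice,bob,Replies to
carol,bob,Replies to
dave,alice,Mentions
bob,carol,Replies to
"""


def replied_to_counts(rows, type_column="relationship", reply_label="replies to"):
    """Count reply edges pointing at each target user (in-degree on replies)."""
    counts = Counter()
    for row in rows:
        # Keep only reply edges; mention edges are ignored.
        if row[type_column].strip().lower() == reply_label:
            counts[row["target"]] += 1
    return counts


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
counts = replied_to_counts(rows)
print(counts.most_common())
```

Because the edge runs from ‘source’ to ‘target’, a user who is replied to often has a high in-degree once the mention edges are removed. Users who only appear in mention edges (such as `dave` and `alice` in the sample) receive no count.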

TASKS

Perform the following steps on the data. Anything in RED requires you to take an action which results in an artefact to be included in your assignment submission.

1. Import the data set into Gephi as a directed graph (you can have Gephi automatically create the nodes).

2. Filter the data set so that you initially focus only on ‘Replies to’ relationships, i.e., get rid of the ‘mention’ edges (0.25 points)

3. Add one or more additional filters to get rid of the disconnected nodes (0.25 points)

4. Run ‘modularity’ and other basic statistics we have discussed in class. Attempt to get ~8 clusters

5. Format your graph so that your clusters are colored (0.25 points)

6. Size your nodes based on how often a user is ‘replied to’ (replied to more often should be larger) (0.25 points)

7. For each of the 5 largest clusters in your network, do the following (1 point):

a) Identify the top “replied to” Twitter user (who was replied to the most)

b) Based on this identification, describe what seems to be the main topic(s) this user is getting “replied to” about (you will need to read the actual tweets in the data laboratory)

8. Create 1 or 2 appealing SNA graphs that capture the above analysis. For each graph, please do two screen captures (1 point):

a) The graph itself

b) The Appearance tab, with the ranking of the size of nodes showing; the Context tab; and the Filters tab, with an expanded view of the filters showing. To do this, close the graph window, then resize the other windows so that all requested information is visible. An example is shown below (note this is not based on the assignment data).