Seminar: Readings in Social Computing Systems
Project stage II: Understanding rumor propagation
So far you have collected some ground truth information about known rumor and valid tweets. We will release a dataset of tweets collected by everyone in the first stage. This dataset will aid you in doing further analysis to understand how rumors propagate.
In this stage, we will focus on understanding how rumors propagate on Twitter compared to valid tweets. The high-level goal is to explore the various dimensions of the data along which rumor and valid tweets might differ for a particular event. We have prepared a list of dimensions and relevant questions for each dimension to get you started. You should not treat these analysis guidelines as exhaustive -- there may be other ways (e.g., other dimensions or other questions under each dimension) to characterize the differences between rumor and valid tweets, and we expect students to come up with new ideas.
Dimension 1: Profile/network characteristics of users who posted the rumor and valid tweets
- Who are the originators of rumor tweets?
- Are they popular users or average users? (Popularity can be measured in terms of in-degree, list count, verified status, etc.)
- What does the social network of the early originators look like?
- How many followers and followings do they have?
- What is the ratio of the number of followers to followings?
- What are the characteristics of users who endorse a rumor? How many users (1) refute a rumor or (2) question it, and what are their characteristics?
- Based on the above analysis, are the characteristics of originators and posters of rumor tweets different from those associated with valid tweets of the same topic?
Related readings: [1], [2]
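As a starting point for the follower/following questions above, here is a minimal sketch of how one might summarize a group of users. The record layout (keys `followers`, `followings`, `verified`) and the sample values are assumptions made for illustration; they are not the format of the released dataset.

```python
from statistics import median


def follower_ratio(followers, followings):
    """Followers-to-followings ratio; None when the user follows nobody."""
    if followings == 0:
        return None
    return followers / followings


def summarize(users):
    """Summarize a non-empty list of user records (hypothetical layout)."""
    if not users:
        raise ValueError("need at least one user")
    ratios = [
        r
        for r in (follower_ratio(u["followers"], u["followings"]) for u in users)
        if r is not None
    ]
    return {
        "n_users": len(users),
        "median_followers": median(u["followers"] for u in users),
        "median_ratio": median(ratios) if ratios else None,
        "verified_share": sum(u["verified"] for u in users) / len(users),
    }


# Made-up sample records, for illustration only.
rumor_users = [
    {"followers": 120, "followings": 300, "verified": False},
    {"followers": 50, "followings": 0, "verified": False},
    {"followers": 9000, "followings": 450, "verified": True},
]
rumor_summary = summarize(rumor_users)
```

Running the same `summarize` on the users behind valid tweets of the same topic gives two summaries that can be compared side by side. Medians are used here because follower counts tend to be heavily skewed; other statistics may suit your data better.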
Dimension 2: Tweet characteristics of rumor and valid tweets
- Do rumor tweets contain URLs?
- Are there any particular domains that come up more often?
- Do rumor tweets with URLs endorse, question or refute rumors?
- Do rumor tweets contain hashtags?
- Are there any special hashtags for rumors?
- Can the co-occurrence of some hashtags be used to identify rumor tweets?
- How are mentions used in rumor tweets?
- Are the original sources given credit via mention?
- Do the tweets questioning or refuting the rumor usually mention the person who originally tweeted it?
- Are rumor tweets retweeted? If yes, how often?
- Can you use any of the above tweet content features (or any other as well) to distinguish between rumor and valid tweets, or between tweets endorsing, questioning, or refuting the rumor?
Related readings: [3]
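The URL, hashtag, and mention questions above could be approached with a sketch like the following, which works on raw tweet text only. The regular expressions are simplifications, the sample tweets are made up, and shortened links would need to be expanded before the domain counts become meaningful.

```python
import re
from collections import Counter
from itertools import combinations
from urllib.parse import urlparse

URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#(\w+)")
MENTION_RE = re.compile(r"@(\w+)")


def tweet_features(text):
    """Extract URL domains, hashtags, and mentions from one tweet's text."""
    urls = URL_RE.findall(text)
    domains = [urlparse(u).netloc.lower() for u in urls]
    # Remove URLs first so that '#' or '@' inside a link is not counted.
    rest = URL_RE.sub(" ", text)
    hashtags = sorted({h.lower() for h in HASHTAG_RE.findall(rest)})
    mentions = [m.lower() for m in MENTION_RE.findall(rest)]
    return {"domains": domains, "hashtags": hashtags, "mentions": mentions}


def count_features(texts):
    """Count domains and co-occurring hashtag pairs over a list of tweets."""
    domain_counts, pair_counts = Counter(), Counter()
    for text in texts:
        features = tweet_features(text)
        domain_counts.update(features["domains"])
        # Hashtags are sorted, so each pair has one canonical ordering.
        pair_counts.update(combinations(features["hashtags"], 2))
    return domain_counts, pair_counts


# Made-up sample tweets, for illustration only.
texts = [
    "BREAKING: shark on the highway! #sandy #fake http://example.com/a",
    "Is this real? @someuser #Sandy #fake",
    "Flooding confirmed by officials #sandy http://example.org/b",
]
domain_counts, pair_counts = count_features(texts)
```

Computing these counts separately for rumor and valid tweets (or for endorsing, questioning, and refuting tweets) shows which domains or hashtag pairs are over-represented in one group.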
Dimension 3: Information spread for rumor and valid tweets in Twitter
- How do rumor tweets spread in the social network?
- Do they form a cascade?
- What is the reach of the information flow in the social network?
- How do rumor tweets spread geographically?
- Do they originate/spread only in the area near the physical source of the rumor?
- What are the temporal characteristics of a rumor cascade?
- Based on the above analysis, do the information flow patterns of rumor tweets differ from those of valid tweets?
Related readings: [4], [5], [6]
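For the cascade questions above, one possible sketch measures the size, depth, and duration of each cascade once every tweet has been linked to the tweet it was propagated from. The keys `id`, `parent_id`, and `time` are hypothetical; how a parent is inferred (for example, from retweet information or from the follower graph plus timestamps) is a modeling choice you would need to make and justify.

```python
from collections import defaultdict


def cascade_stats(tweets):
    """Return size, depth, and duration for every cascade.

    tweets: list of dicts with hypothetical keys 'id', 'parent_id'
    (None for a cascade root), and 'time' (seconds). Assumes the
    parent links contain no cycles.
    """
    by_id = {t["id"]: t for t in tweets}
    children = defaultdict(list)
    roots = []
    for t in tweets:
        parent = t["parent_id"]
        if parent is None or parent not in by_id:
            roots.append(t["id"])  # unknown parents are treated as roots
        else:
            children[parent].append(t["id"])

    stats = {}
    for root in roots:
        start = by_id[root]["time"]
        size, depth, last_time = 0, 0, start
        stack = [(root, 0)]
        while stack:  # iterative depth-first traversal of the cascade tree
            node, level = stack.pop()
            size += 1
            depth = max(depth, level)
            last_time = max(last_time, by_id[node]["time"])
            stack.extend((child, level + 1) for child in children[node])
        stats[root] = {"size": size, "depth": depth, "duration": last_time - start}
    return stats


# Made-up sample cascade, for illustration only.
tweets = [
    {"id": 1, "parent_id": None, "time": 0},
    {"id": 2, "parent_id": 1, "time": 60},
    {"id": 3, "parent_id": 1, "time": 90},
    {"id": 4, "parent_id": 2, "time": 300},
    {"id": 5, "parent_id": None, "time": 10},
]
stats = cascade_stats(tweets)
```

Comparing the distributions of size, depth, and duration between rumor cascades and valid-tweet cascades is one way to quantify differences in information flow.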
Expected output
The various project milestones for stage 2 are below.
1. Analysis proposal presentation
Date of submission : 2nd July 2013, 11:59pm
During the 3rd July 2013 class, you must give a short 10-minute analysis proposal presentation. You will have an additional 15 minutes to get feedback and answer any questions. The presentations have to be uploaded to the submission site by 2nd July 2013 at the latest. The presentations should be in ppt/pptx/pdf format and should have the following name: projectPhase2.status1.<lastname>.ppt/pptx/pdf
The purpose of this presentation is to present a proposal for your analysis. This includes presenting (i) your planned methodology for the analysis and (ii) the questions you are aiming to answer.
2. Status Update presentation
Date of submission : 9th July 2013, 11:59pm
During the 10th July 2013 class, you must give a short 10-minute status update presentation. You will have an additional 15 minutes to get feedback and answer any questions. The presentations have to be uploaded to the submission site by 9th July 2013 at the latest. The presentations should be in ppt/pptx/pdf format and should have the following name: projectPhase2.status2.<lastname>.ppt/pptx/pdf
The purpose of this presentation is to give a status update consisting of (i) your progress so far, and (ii) your plan for the rest of the time.
3. Final report
Date of submission : 27th July 2013, 11:59pm
Please provide a final report describing your analysis results on rumor spreading behavior. Please upload your report to the submission site by 27th July 2013 at the latest.
While writing the report, you should make use of neatly rendered tables, figures, plots, maps, or other visualizations to support your discoveries about the characteristics of rumor spreading.
- a. The final report needs to be at least 4 pages (single-spaced, font size: 10 point, single column).
- b. The name of the file should be: <Lastname1>.<Lastname2>.stage2.report.pdf.
- c. In this report, you must describe:
- i. Any additional data (excluding the provided corpus) collected while answering the questions (e.g. crawling the profiles of users posting rumor/valid tweets).
- ii. The methodology used and results obtained for each dimension described above, along with any additional analysis to better quantify the differences between rumor tweets and valid tweets.
- d. At the end of the report, each member should specify his/her contributions to the project.
- e. Please cite any other sources you use in this stage for collecting, analysing, or characterizing data.
4. Final presentation
Date of submission : 4th August 2013, 11:59pm
On 5th August 2013 at 10:00 am, you must give a 15-minute final presentation for this stage. You will have an additional 15 minutes to get feedback and answer any questions. The presentations have to be uploaded to the submission site by 4th August 2013 at the latest. The presentation should be in ppt/pptx/pdf format and should have the following name: projectPhase2.final.<lastname>.ppt/pptx/pdf
How to submit
You should make all your submissions using the project submission site.
Related Readings
- [1]. Measuring user influence in Twitter: The million follower fallacy (ICWSM’10)
- [2]. A few chirps about Twitter (WOSN’08)
- [3]. Why We Twitter: Understanding Microblogging Usage and Communities (WEBKDD’07)
- [4]. On Word-of-Mouth Based Discovery of the Web (IMC’11)
- [5]. What is Twitter, a social network or a news media? (WWW’10)
- [6]. Who says what to whom on Twitter (WWW’11)