DocNow Community Chat: Social Media Research

#docnowcommunity chat

  1. These tweets were part of Documenting the Now's second Twitter chat on August 11, 2016. The tweets have been curated, which means we've attempted to keep conversations and responses to questions grouped together, altering the purely chronological flow of the original stream. We also tried to include replies that did not use the #docnowcommunity hashtag. Please get in touch if you sent a tweet that you would like to have included or removed.
  2. Join us on Thursday, August 11th, 2016 at 2pm EST for a twitter chat on social media research. #docnowcommunity
  3. Introductions

  4. Welcome to our Twitter chat about social media research. Please take a moment to introduce yourself #docnowcommunity.
  5. I’m Bergis Jules, one of the PI’s for the @documentnow project and I’ll be posting the questions today. #docnowcommunity
  6. hi #docnowcommunity, digital archivist & records manager at a public uni in Ohio. Mostly listening as I compile my #saa16 reimbursement req
  7. I’m Ian Milligan, historian and #webArchiving researcher. Looking forward to participating in today’s #docnowcommunity
  8. Hi I'm Ed Summers, i've done some analysis of Ferguson data with @SociologistRay and @BlackFeministMB & work on docnow #docnowcommunity
  9. Hi, #docnowcommunity. I’m Dorothea Salo, iSchool instructor, interested in digital (incl web, social media) and A/V preservation.
  10. Hi, I’m Shawn Walker, my research focuses on social media and political participation - especially by social movements. #docnowcommunity
  11. Hello all! Nicholas Proferes here, I study users' understandings and beliefs about how social media work #docnowcommunity
  12. I am Desiree Jones-Smith, a member of the @documentnow team. I work to manage the project and am excited to be a part of #docnowcommunity.

  13. Q1. What types of research do you do with social media: topics, disciplines, methods?

  14. Q1. What types of research do you do with social media: topics, disciplines, methods? #docnowcommunity
  15. @documentnow I research designing a method for building collections for stories and events esp. without domain knowledge. #docnowcommunity
  16. I'm interested in: characteristics of good collections, can we teach a machine to build good collections #docnowcommunity
  17. A1: quite a bit around collecting, analyzing, + thinking about open-source approaches. Work a lot with @ruebot on all this! #docnowcommunity
  18. A1. None yet, but based on a small survey we did w researchers, we anticipate future interest  #docnowcommunity
  19. I work with social media methods and the impact of the ephemerality of social media data on research. #docnowcommunity
  20. A1 Hello #docnowcommunity helping Dutch students collect and make publicly available tweets on their uprising against univ board
  21. @mjschaap Are you concerned about the University using this tweet collection for negative purposes? #docnowcommunity
  22. @fromADMwithlove no, but others might be and they are my concern, so it's more ethical #docnowcommunity
  23. @fromADMwithlove 2nd concern: copyright: is it lawful to harvest tweets from others and publish them on our website? #docnowcommunity
  24. lawful? twitter's ToS seem to say so (with attribution)...though who really knows. ethical? much harder to suss out 
  25. A1: Started working with social media content on @OurMarathon; particularly interested in archiving / analyzing memes #docnowcommunity
  26. @edsu I'm particularly interested in image macros / transformative use of images and text circulating on social media #docnowcommunity
  27. @edsu but yes, I was recently having a conversation about distinctions between "art" and "memes," so it's hard to define! #docnowcommunity

  28. Q2. What are some of the primary challenges you encounter in your research with social media?

  29. Q2. What are some of the primary challenges you encounter in your research with social media? #docnowcommunity
  30. @documentnow One of the greatest challenges involves the ethics and rights of using the social media archives we create. #docnowcommunity
  31. @documentnow 1. relevance to subject 2. balanced as much as possible (all bias represented) 3. context available (conversation) #docnowcommunity
  32. A2: The ethical dimensions of archiving social media content; balancing close reading and distant reading #docnowcommunity
  33. A2: the black boxes of social media platforms, how to evaluate the completeness of what they offer is challenging #docnowcommunity
  34. A2: for me the ones of scoping (what to collect? what hashtags? is that enough?), as well as ethics & preservation Qs. #docnowcommunity
  35. @ianmilligan1 yes! knowing what to collect and how it was collected is an area that we are focused on in DocNow #docnowcommunity
  36. A.2 Understanding limits of Twitter’s terms of service is complicated in itself, no less how to share the data with others. #docnowcommunity
  37. A2: To scrape or not to scrape? Sometimes getting stuff is easier by scraping, but scraping isn't stable/general/shareable #docnowcommunity
  38. .@acnwala that's really interesting -- what about scraping data isn't stable & shareable as compared to data from APIs? #docnowcommunity
  39. @edsu stability: if DOM changes, code might break if code relies on what changed (DOM usually changes slowly). #docnowcommunity
  40. @edsu not-shareable: since Twitter doesn't want to be scraped (terms of serv), I am reluctant to share any code that scrapes #docnowcommunity
  41. @edsu especially when it's about collecting threaded conversations, scraping is straight forward, api - not #docnowcommunity
  42. @edsu @acnwala also, scraping software is more easily broken since it usually depends on structures in the page. #docnowcommunity
  43. .@walkeroh @acnwala indeed, scraping is the epitome of transient software! more incentives for apis to be stable #docnowcommunity
  44. @acnwala @edsu something I worry about with archival datasets too: stability #docnowcommunity esp v/v how to handle deleted tweets
  45. .@documentnow for me, a big challenge is understanding what sources users draw on to develop beliefs abt how platforms work #docnowcommunity
  46. @edsu issues of publicness, transparency, and concept are difficult at this point in time. #docnowcommunity
  47. @edsu I struggle with the idea that public posts are open for use. What does consenting 20k+ accounts look like? #docnowcommunity
  48. @edsu while there are guidelines, it’s unclear how to apply these guidelines to social media platforms. #docnowcommunity
  49. A2: Not having access to tools / data that social scientists have #docnowcommunity
  50. .@JimMc_Grath what kinds of barriers are there for getting & using those tools? #docnowcommunity
  51. @fromADMwithlove for sure! Also access to skills / resources to be able to get lots of social media data #docnowcommunity
  52. @edsu money, labor, time, digital space: was thinking of the access to, say, Twitter firehose vs. getting data other ways #docnowcommunity

  53. Q3. Do you seek IRB approval (or an equivalent) as part of your research? Why or why not?

  54. Q3. Do you seek IRB approval (or an equivalent) as part of your research? Why or why not? #docnowcommunity
  55. A3: UMD IRB recently told @SociologistRay @BlackFeministMB & I that consent was not required for public tweets #docnowcommunity
  56. @chrisfreeland we were told we weren't doing research! which was kind of a blessing & a curse :-) #docnowcommunity
  57. @moduloone how do you assess the potential harms of an intervention in these spaces? #docnowcommunity
  58. @walkeroh Very carefully :) Really tho, its not just me assessing. I talk to others to get opinions, IRBs, colleagues, etc. #docnowcommunity
  59. @moduloone Great answer! These platforms present new harms, etc. So how do we expand our conceptions of harm? #docnowcommunity
  60. .@walkeroh We have an individualistic notion of harm based on the empirical. Community harm & harm to agency need to be considered. #docnowcommunity
  61. .@walkeroh @moduloone I'm interested in that question too. it seems harder than usual, but maybe because i haven't done it? #docnowcommunity
  62. A3: but when publishing tweets of activists it seems that care is needed like @freelon @meredithclark have in their work #docnowcommunity
  63. A3. We haven’t, but increasingly thinking that’s going to be the way things need to move. #docnowcommunity
  64. @ianmilligan1 Why do you believe that you will need to consider IRB approval? Can you expand your thinking? #docnowcommunity
  65. @dpjones1983 I guess I’m becoming uneasy with the (at least for us) consent issues inherent in social media research #docnowcommunity
  66. @dpjones1983 i.e. for us historians, we go through IRB for oral history but not public accessible tweets. #docnowcommunity
  67. A3: during the occupation of Amsterdam uni there were people who 'did not want to be seen', suppose/hope they didn't tweet? #docnowcommunity
  68. .@mjschaap did you work with social media data from the occupation? #docnowcommunity
  69. @edsu indeed, the students estimate we will have to collect ca 6000 tweets from the time of the 5wk occupation #docnowcommunity

  70. Q4. What tools are you currently using as part of your social media research?

  71. Q4. What tools are you currently using as part of your social media research? #docnowcommunity
  72. @fraistat Mostly I use tools I’ve developed for Twitter data collection. I use heritrix for web archiving. #docnowcommunity
  73. @fraistat preservation of metadata is important to my work. Most tools truncate posts post/profile metadata. #docnowcommunity
  74. @fraistat not yet, but that’s something I’m working on. I’m happy to share privately until they’re more polished. #docnowcommunity
  75. A4. and a combination of Dataverse, Zenodo (Invenio), GitHub, & #Islandora for sharing & providing access to our datasets. #docnowcommunity
  76. A4. The always amazing twarc by @edsu; @ruebot has built corpora from tweets using Heritrix; plus lots of bash scripting. #docnowcommunity
  77. .@walkeroh how important is it for researchers to build tools themselves, so they understand how they work? #docnowcommunity
  78. @edsu it’s important to not conflate ability to build tools and understanding the epistemologies/methods under the hood. #docnowcommunity
  79. @edsu most tools aren’t well documented except by the code, so how do potential users evaluate — especially researchers? #docnowcommunity
  80. .@walkeroh yes, that's what i'm asking; if researchers reach for their own tools because they will understand their limits #docnowcommunity
  81. @edsu In light of that, what questions can we ask of the data while respecting its limitations? #docnowcommunity
  82. @edsu If tool building isn’t the focus of your research then it’s taking time away from it, right? #docnowcommunity
  83. .@walkeroh seems like step one is being able to talk about the limitations -- which is why I'm a big fan of your work. #docnowcommunity
  84. A question I've been thinking about is how we can use metadata to support ethics in opendata sets. #docnowcommunity 
  85. Like, how do we create provenance to support ethics as part of metadata standards? #docnowcommunity
  86. @edsu Creating more researchers focused on these limitations would be a good step. but where should the output go? #docnowcommunity

  87. Q5. How important is it for you to publish/share your research data? If yes, how do you do it?

  88. Q5. How important is it for you to publish/share your research data? If yes, how do you do it? #docnowcommunity
  89. A5: I put the Ferguson dataset we worked with up on Internet Archive, but only as a dataset of Tweet IDs (per ToS) #docnowcommunity
  90. A5: please do for you're far ahead in the US, bit pioneering here so I need all the inspiration i can get from you!!! #docnowcommunity
  91. A5: very important for scholarly work for reproducibility. Sharing tweet IDs for large dataset #docnowcommunity
  92. .@acnwala Big issue for consent! Many users don't know abt the dozens of metadata fields that may be pulled up #docnowcommunity
  93. A5: I've published Twitter IDs in the past per its TOS. Ppl still ask me for the full datasets! #docnowcommunity
  94. .@dfreelon thanks for doing that! it has been very useful to the @documentnow project #docnowcommunity
  95. A5: But Twitter datasets degrade quickly. Our BLM dataset suffered a 10% attrition rate only a year later. What to do? #docnowcommunity
  96. .@dfreelon yes, I still get some too for our Ferguson dataset; do any of them end up hydrating the IDs? #docnowcommunity
  97. @edsu Yes, I know of at least one case where someone published a paper based on rehydrating Twitter IDs I originally collected.
  98. @acnwala I know, I did some of the earliest published work on Arab Spring tweets. @hanysalaheldeen
  99. @dfreelon Do you think it would be useful for those folks to have a simple tool to recreate the tweets from the IDs? #docnowcommunity
  100. @documentnow that would be nice, but 1. still need code to handle large data volumes and 2. attrition still a big problem
  101. A5: Sharing data is very important. We share it within our lab. Share IDs for those outside our lab. Not ideal at all! #docnowcommunity
  102. .@walkeroh do you keep any record of the data's provenance when you are doing that sharing? #docnowcommunity
  103. A5 Challenge is finding the best way to model & share these datasets, along w/the best documentation & metadata around them #docnowcommunity
  104. A5 @scholarsportal gets us a hdl, but replicating the dataset w/sameAs in Zenodo gets us a DOI, which is nice for analytics #docnowcommunity
  105. A5. All that said, another challenge is getting folks to actually cite your data when they use it. #docnowcommunity
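
The attrition raised in the A5 replies can be estimated by comparing the tweet IDs that were shared with the IDs that still come back after rehydration. What follows is a minimal Python sketch, assuming both ID lists are already loaded as strings. The function name `attrition_rate` and the sample IDs are hypothetical and not part of any tool mentioned in the chat.

```python
def attrition_rate(shared_ids, rehydrated_ids):
    """Return the fraction of shared tweet IDs that could not be rehydrated."""
    shared = set(shared_ids)
    if not shared:
        raise ValueError("shared_ids is empty")
    # IDs that were in the shared dataset but did not come back.
    missing = shared - set(rehydrated_ids)
    return len(missing) / len(shared)


# Made-up IDs for illustration only: ten shared, nine still available.
shared = [str(n) for n in range(1000, 1010)]
rehydrated = shared[:9]
print(f"attrition: {attrition_rate(shared, rehydrated):.0%}")
```

Recording this figure together with the rehydration date gives later users of a shared ID list a rough sense of how far their copy may differ from the original collection.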

  106. Q6. What social media platforms do you focus on for your research?

  107. Q6. What social media platforms do you focus on for your research? #docnowcommunity
  108. A6: Just Twitter, ‘tho am curious to maybe try using some Reddit data at some point .. #docnowcommunity
  109. A6: Mostly Twitter and FB, but always looking to expand. Have a forthcoming paper using federal comment data, fun stuff #docnowcommunity

  110. Q7. Do you see any opportunities for improved social media collection and analysis tools?

  111. Q7. Do you see any opportunities for improved social media collection and analysis tools? #docnowcommunity
  112. A7: A code-free frontend for Twarc would be awesome for smaller projects. #docnowcommunity

  113. Q8. Are you interested in using tools that can extract images and videos from tweets?

  114. Q8. Are you interested in using tools that can extract images and videos from tweets? #docnowcommunity
  115. Q8: Yes, images and links are an integral part of many social media posts though most analysis treats them as only text #docnowcommunity
  116. A8: Something that pulled the most RTed/faved/commented images and videos would be very useful #docnowcommunity
  117. @dfreelon We have some good stuff to share with the advisory board during the upcoming meeting. #docnowcommunity
  118. A8: I wrote some code to do this for our BLM report but it's not well-documented... #docnowcommunity
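
The idea in the A8 replies, pulling the most retweeted images and videos, can be sketched in a few lines of Python. This is not the code mentioned in the chat; it is an illustration that assumes tweets are available as dictionaries in the Twitter v1.1 JSON layout (`retweet_count`, and media entries under `extended_entities` or `entities` with a `media_url_https` field). The function name `top_media` and the sample data are hypothetical.

```python
from collections import defaultdict


def top_media(tweets, n=3):
    """Rank media URLs by the highest retweet count of any tweet carrying them."""
    counts = defaultdict(int)
    for tweet in tweets:
        # Assumption: media lives under extended_entities, else entities.
        entities = tweet.get("extended_entities") or tweet.get("entities") or {}
        for item in entities.get("media", []):
            url = item.get("media_url_https")
            if url:
                # Use max rather than sum: retweets repeat the original's count.
                counts[url] = max(counts[url], tweet.get("retweet_count", 0))
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Made-up tweets for illustration only.
sample = [
    {"retweet_count": 12, "entities": {"media": [{"media_url_https": "https://example.org/a.jpg"}]}},
    {"retweet_count": 40, "extended_entities": {"media": [{"media_url_https": "https://example.org/b.jpg"}]}},
    {"retweet_count": 3, "entities": {}},
    {"retweet_count": 12, "entities": {"media": [{"media_url_https": "https://example.org/a.jpg"}]}},
]
for url, retweets in top_media(sample, n=2):
    print(retweets, url)
```

Favorites or reply counts could be ranked the same way by swapping the field that is read, provided the collected data actually includes them.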

  119. Additional related conversation that wasn't in response to a specific question.

  120. @documentnow My diss on users' beliefs about info flow on Twitter. (Condensed ver. hopefully soon)  #docnowcommunity
  121. @documentnow is there an existing zotero group or other place to curate papers and other resources? #docnowcommunity
  122. @walkeroh Not sure but there is a lot of sharing in our slack channel. About 150 members so far. #docnowcommunity
  123. .@acnwala Here's an awful question: If what Twitter is is constantly changing, is reproducibility actually possible? #docnowcommunity
  124. .@moduloone very difficult to impossible: Twitter's tools seem to favor the now, archives may be crucial for reproducibility #docnowcommunity
  125. @moduloone @acnwala this is an issue for research in general, right? Social media data put a very special spin on it. #docnowcommunity
  126. @walkeroh @moduloone absolutely, research papers - esoteric, code links broken, datasets - god sent #docnowcommunity
  127. @moduloone @acnwala if we archive the data and are willing to share it eventually, we can create a dataset and make it stable
  128. .@diuhtez What if 10% of tweets are later deleted by users? You can ignore, but this pits reproducibility vs respecting users. #docnowcommunity
  129. .@diuhtez Good point! Does seem to be about the scale of time we need reproducibility in. #docnowcommunity
  130. This has been a really great conversation. Thanks for joining us. Storify coming soon. Feel free to continue the chat here #docnowcommunity