Recognizing the Work of Reddit’s Moderators: Summer Research Project

A beautiful example of mixed methods research. Even with all the quantitative data we can harvest today, this project still dives into participant observation, hoping to find contextualized insights.

Social Media Collective

What does it take to keep online communities going? With over 550,000 public subreddits, many of which are active, the communities on the site rely on ongoing effort by a large number of volunteer moderators. In my research, I’ve made the case that caring for the communities we’re part of is an important kind of digital citizenship. For that reason, I’m excited to learn more from redditors about how they see the work of moderation, why they do it, and what is/isn’t their job.

This spring, I’ve been reading extensively about digital labor and citizenship online, including the story of over 30,000 AOL community leaders who facilitated online communities in the 90s. With Reddit pushing for profitability and promising…


Visualizing colored tables in R – I’m dying here

Documenting my R Learning Quest Vol.2

For some time I have needed to present tables with conditional formatting of cells. In my quest for the perfect table-displaying library I have found nothing; it seems this is not a need many people share. I’ll be storing my attempts and discoveries here.
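
To make the need concrete, here is a minimal base R sketch, not a finished solution and not taken from any table library. It builds an HTML table by hand and colours one column according to a simple rule. The data frame `scores`, the thresholds, and the colour codes are all hypothetical and chosen only for illustration.

```r
# Hypothetical data: scores to be highlighted according to their value
scores <- data.frame(
  name  = c("Ana", "Ben", "Cai"),
  score = c(82, 47, 65),
  stringsAsFactors = FALSE
)

# Pick a background colour per cell from an (assumed) rule:
# below 50 is red, below 70 is yellow, everything else is green
cell_colour <- ifelse(scores$score < 50, "#f4cccc",
               ifelse(scores$score < 70, "#fff2cc", "#d9ead3"))

# Build one <tr> per row, styling only the score cell
rows <- sprintf(
  '<tr><td>%s</td><td style="background-color:%s">%.0f</td></tr>',
  scores$name, cell_colour, scores$score
)

# Assemble the full table as a single HTML string
html <- paste0(
  "<table>",
  "<tr><th>name</th><th>score</th></tr>",
  paste(rows, collapse = ""),
  "</table>"
)

cat(html, "\n")
```

The resulting string can be pasted into any HTML document. The point is only that the formatting rule lives in a plain vector (`cell_colour`), so it is easy to change without touching the table-building code.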


The Myth of Comprehensive Data

On using Twitter data as if it were truly global and representative.

Dart-Throwing Chimp

“What about using Twitter sentiment?”

That suggestion came to me from someone at a recent Data Science DC meetup, after I’d given a short talk on assessing risks of mass atrocities for the Early Warning Project, and as the next speaker started his presentation on predicting social unrest. I had devoted the first half of my presentation to a digression of sorts, talking about how the persistent scarcity of relevant public data still makes it impossible to produce global forecasts of rare political crises—things like coups, insurgencies, regime breakdowns, and mass atrocities—that are as sharp and dynamic as we would like.

The meetup wasn’t the first time I’d heard that suggestion, and I think all of the well-intentioned people who have made it to me have believed that data derived from Twitter would escape or overcome those constraints. In fact, the Twitter stream embodies them. Over the past two decades, technological, economic, and political changes have…


3 Ways to Recode Categorical Variables in R

Documenting my R Learning Quest Vol.1

Sooner or later you will need to recode some variables or ‘translate’ obscure values into more informative labels. There are, of course, several ways to do this; here I’m just listing the ones I used during the first stages of my R learning quest, along with my current favourite.
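
As a rough illustration of what recoding looks like, here is a sketch of three common base R approaches. These are not necessarily the three covered in the full post; the vector `responses` and its labels are hypothetical.

```r
# Hypothetical survey responses stored as obscure codes
responses <- c("1", "2", "3", "2", "9")

# 1. Named lookup vector: names are the old codes, values are the new labels.
#    Codes missing from the lookup (here "9") become NA.
lookup <- c("1" = "Agree", "2" = "Neutral", "3" = "Disagree")
by_lookup <- unname(lookup[responses])

# 2. factor() with explicit levels and labels.
#    Values not listed in `levels` also become NA.
by_factor <- factor(responses,
                    levels = c("1", "2", "3"),
                    labels = c("Agree", "Neutral", "Disagree"))

# 3. Logical indexing, one code at a time
by_index <- rep(NA_character_, length(responses))
by_index[responses == "1"] <- "Agree"
by_index[responses == "2"] <- "Neutral"
by_index[responses == "3"] <- "Disagree"

print(by_lookup)
print(by_factor)
print(by_index)
```

All three give the same labels for this input. The lookup vector keeps the code-to-label mapping in one place, while `factor()` is handy when the order of the categories matters later on.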


Data Visualization cheatsheet, plus Spanish translations

Great resource for those of us with poor short-term memory.

RStudio Blog


We’ve added a new cheatsheet to our collection. Data Visualization with ggplot2 describes how to build a plot with ggplot2 and the grammar of graphics. You will find helpful reminders of how to use:

  • geoms
  • stats
  • scales
  • coordinate systems
  • facets
  • position adjustments
  • legends, and
  • themes

The cheatsheet also documents tips on zooming.
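
To show how the pieces listed above fit together, here is a small sketch using the `mpg` dataset that ships with ggplot2. It is not taken from the cheatsheet; the choice of variables, palette, and axis limits is arbitrary and only meant to touch each component once.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = displ, y = hwy, colour = class)) +
  # geom + position adjustment: jitter points slightly along x to reduce overplotting
  geom_point(position = position_jitter(width = 0.05, height = 0)) +
  # stat: one fitted linear trend per panel, ignoring the colour grouping
  stat_smooth(aes(group = 1), method = "lm", formula = y ~ x,
              se = FALSE, colour = "grey40") +
  # scale + legend: choose the colour palette and set the legend title
  scale_colour_brewer(palette = "Set2", name = "Vehicle class") +
  # facets: one panel per drive type
  facet_wrap(~ drv) +
  # coordinate system: zoom the y axis without discarding any data
  coord_cartesian(ylim = c(10, 40)) +
  # themes: overall look and legend placement
  theme_minimal() +
  theme(legend.position = "bottom")

p
```

On zooming: `coord_cartesian()` only changes the visible window, so the fitted lines are still computed from all the data. Setting limits on a scale instead (for example with `ylim()`) drops the out-of-range points before the stats are computed, which can change the fitted line.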

Download the cheatsheet here.

Bonus – Frans van Dunné of Innovate Online has provided Spanish translations of the Data Wrangling, R Markdown, Shiny, and Package Development cheatsheets. Download them at the bottom of the cheatsheet gallery.
