Showing posts with label visualization. Show all posts

Thursday, March 12, 2015

How to create an M&E dashboard using Google Apps

Last year we did a fair amount of work with IDRC to set up a KM platform for a new collaborative research program. As a follow-up to that project, we developed an M&E system for the program, using the same technology infrastructure used to build the platform itself: Google Apps for Business.

After last week’s case study on building the R4D dashboard with Tableau Public, in this post I present how to set up an M&E system and dashboard using a combination of various Google Apps and free third-party tools. This post is very much an overview of the process and the final product we delivered. In the next posts in this series I’ll look at the specific tools from a more technical perspective.
M&E Dashboard

Who needs a dashboard, and why? 


This IDRC program comprises four research consortia plus the IDRC program team in Canada. Each consortium works on a specific issue related to climate change and adaptation, bringing together different, geographically dispersed organizations. Since collaboration is the basis of the program, the M&E system had to follow the same principle. Our brief was to “design platform-based, collaborative tools to collect monitoring data on up to eight key indicators in the Monitoring Framework.”

Ultimately, these data had to be brought together into an M&E dashboard that could be easily shared with donors and program leads as a link, or quickly printed to PDF at regular intervals. Like the R4D dashboard, this one had to provide a “snapshot of progress against key indicators in the program's monitoring framework using data entered by consortia and the IDRC Team.”

So what are these indicators?

What to measure? Theory of change and monitoring framework 

The program M&E working group had already produced a solid Theory of Change with three clear objectives; for each, they had defined the dimensions and potential metrics to be included in the M&E system. This made our job easier, as it was clear from the outset what had to be measured and for what purposes. So we just had to help the team unpack the various metrics and dimensions a bit, and define the exact indicators and values to be tracked in the system:
  • Research outputs and pilots, including type of output, authorship (gender and country), quality of outputs and their accessibility (whether peer-reviewed and/or openly accessible on the web), etc. 
  • Web traffic, social media and engagement data, such as web sessions and downloads, Twitter followers, the number of conversations and tweeps around specific accounts and hashtags, media tracking, number of events and participant ratings, etc. 
  • Grants and awards distributed, including gender and location of recipients.
M&E Dashboard

When and where? Data collection process and storage 

While the system (and resulting dashboard) was planned to be updated quarterly, we agreed on the principle that data collection would be automated when possible, and manual when other solutions were not at hand. As a result:
  • Data around web traffic and social media are automated or semi automated, using a series of third party tools and applications (I’ll talk about this specifically in the next blog post) 
  • Data around research outputs, pilots, grants and awards are entered via users’ submission forms, using Google Forms. While forms can (potentially) be submitted by anyone who has a user account on the KM platform, in reality specific users for each consortium are responsible for this process, while others are responsible for quality control, to ensure that entries are complete and there are no duplicates. 
Regardless of how the data are collected - manually, automatically or semi-automatically - they all feed into one of three separate log files, one for each objective in the Theory of Change. Google Spreadsheets are used for these log files, with the appropriate sharing and editing permissions in place.
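To illustrate the quality-control step mentioned above, here is a minimal sketch of a duplicate check over submitted entries. This is not the actual program code - the function and parameter names are my own, and in Apps Script the rows would come from `SpreadsheetApp` (e.g. `sheet.getDataRange().getValues()`):

```javascript
// Sketch: flag duplicate entries in a log sheet before they reach the dashboard.
// All names here (rows, keyColumns) are illustrative assumptions.
function findDuplicateRows(rows, keyColumns) {
  const seen = new Set();
  const duplicates = [];
  rows.forEach((row, index) => {
    // Build a key from the columns that identify an entry (e.g. title + consortium)
    const key = keyColumns.map(c => String(row[c]).trim().toLowerCase()).join("|");
    if (seen.has(key)) {
      duplicates.push(index); // row index of the repeated submission
    } else {
      seen.add(key);
    }
  });
  return duplicates;
}
```

The key is normalized (trimmed, lowercased) so that small differences in how two people type the same title don't hide a duplicate.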

How to display the data? Platform, design, prototype and production 

On the basis of an initial sketch of the dashboard produced by the IDRC team, we populated the log files with dummy data and produced two different prototypes: one using Tableau Public and one using Google Charts published on a Google Site. We agreed to use Google tools to avoid adding another layer of complexity to the system and keep it all inside Google Apps. Additionally, since the charts are generated from the log files, updating the log files also updates the charts on the live site - a great shortcut that cuts down the work of updating the dashboard.

Similarly to the R4D dashboard, this program dashboard presents a tabbed navigation at the top, with one tab for each of the objectives monitored in the framework. This way we could present objective-specific charts, tables and figures in a clean, uncluttered interface.

Additionally, the main tab of the dashboard presents what we called ‘curated content’, such as a selection of recent publications, blog posts or key events that are hand-picked by the dashboard administrators to highlight specific information.
M&E Dashboard

What next? Possible platform iteration and next blog posts 

This dashboard went live at the beginning of 2015 and its second update is planned for the end of this quarter, so it’s too soon to evaluate it and think about possible iterations. However, feedback from users has been positive so far, and the system delivers the required information to the different target users.

In my opinion, a possible way to improve it would be to add filters and controls to the charts currently published on the dashboard, so that users can interact with them, browse specific periods of time, make comparisons, and get more out of this visual representation of data.

Doing this requires working with Google Apps Script, a JavaScript-based cloud scripting language that provides easy ways to automate tasks across Google products and third-party services. I’m not a programmer, but I like learning new things and finding solutions that others have already implemented. Even in the current version of this dashboard, I’ve made use of Google Apps Script to collect data and to copy it from one spreadsheet into another.
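The copy-between-spreadsheets step can be sketched as a small function like the one below. This is an illustration under my own assumptions, not the script actually used: in Apps Script, `sourceRows` and `targetRows` would be `getValues()` arrays and the result would be written back with `setValues()` or `appendRow()`:

```javascript
// Sketch: append rows from a source sheet to a target log, skipping rows
// already present (matched on a key column such as a record ID).
// Function and parameter names are illustrative assumptions.
function appendNewRows(sourceRows, targetRows, keyIndex) {
  const existing = new Set(targetRows.map(r => String(r[keyIndex])));
  const toAppend = sourceRows.filter(r => !existing.has(String(r[keyIndex])));
  return targetRows.concat(toAppend); // new target contents
}
```

Keeping the copy idempotent like this means the script can run on a schedule without producing duplicate rows in the log.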

If you are interested in which Apps Scripts I’ve been using and what they can do for you, subscribe to the blog and stay tuned for my next post in this series - or share your experience in the comments below.

Thursday, March 05, 2015

R4D dashboard: Visualize web access to DFID funded research

Collecting traffic and usage statistics for a website or portal can be a very time-consuming and tedious task. And in most cases you end up compiling monthly or quarterly reports for managers and donors that will be shared as email attachments - and, at best, skimmed, since there is so much information. But there are smarter ways to run this process and bring your data to life, as I explained in my previous blog post.

Our case study is the R4D portal, a free-access online portal containing the latest information about research funded by DFID, including details of current and past research in over 40,000 project and document records. Until 2013 we were part of the team supporting and managing the site.

As part of our work packages, we developed an online, interactive visualization of web traffic and usage of the R4D portal and its social media channels. The R4D dashboard, built using Tableau Public, is still updated and in use. However, since the termination of our support contract in 2014, it hasn't been iterated or improved.

This post presents the process we followed to develop the dashboard, the tools used and the lessons learned in what was very much a learning-by-doing journey.

 Why develop the R4D dashboard? 

The collection of usage and traffic data for R4D used to be pretty standard: a series of Excel files updated monthly to generate charts and graphs. These were then put together in a PDF report and shared with project leads at DFID. The idea of developing an online, public dashboard of R4D web traffic and usage instead was inspired by the excellent work of Nick Scott and ODI, which he shared with us during a Peer Exchange session we organized back in 2012.

Donor organisations such as DFID collect a lot of statistics and indicators, but these are often kept within projects and programmes and not made available to all staff, as was the case for R4D. So the reason behind the R4D dashboard was primarily to open up our stats and make them accessible to anybody interested in them, not just the people who had sign-off on the project.

By encouraging a more open approach to web stats, the idea was also to have more terms of comparison: it is difficult to evaluate how well your website is doing if you can only compare against yourself. Being able to see how much traffic similar websites generate helps you assess your own effort and performance.


So what did we do?

Process-wise, we pretty much followed the steps outlined in my previous blog post. With the primary audience well in mind, we started to select the metrics to include in the dashboard:
  • Website stats: Visits and visitors; referring sites; visitors by country; PDF downloads and top pages. 
  • RSS feed subscribers, Twitter clickthroughs and Facebook Insights data (later removed)
  • Number of outputs added to the R4D database (by type, for example open access articles, peer-reviewed articles, etc.) 
We decided it was feasible to collect this data monthly, as XLS or CSV files exported from the site(s) and saved into a shared Dropbox folder. This was the most effective way, as data collection was decentralized, with different people working on different platforms. With our limited budget it was not possible to automate the data collection process, so it was entirely manual.

Software platform selection took quite some time in the initial phase of the process. We selected Tableau Public as our dashboard platform, and then had to invest more time in learning its features and functionality. But it was totally worth it!


Why Tableau? 

Tableau Public is free software that allows anyone to connect to a spreadsheet or file and create interactive data visualizations for the web. There are many tutorials out there if you just Google for them, so I’m not going to explain here how it works in detail. But here are my top reasons for using Tableau Public:
  • It's free! That's a good reason already if you don't have resources to invest in business intelligence or visualization software - the costs for these are normally steep and way outside the budget of the organizations we work with; 
  • It's intuitive. You don't need to be an expert to use the tool. The interface is very simple (drag and drop) and you can easily find your way around. 
  • It's rich and deep. There are so many charts to choose from, and you can play around with different visualizations until you are happy with the result. It also goes much deeper than Excel in analysis and interactions.


What did we learn? 

Besides learning how to use Tableau Public itself, here are the main things I learned along and around the process of developing the R4D dashboard:
  • Google Analytics is the industry standard - but it tends to under-count your traffic.
    We ran two different website analytics packages on the main R4D portal - Google Analytics (GA) and SmarterStats - and noticed a huge difference in the results, with GA massively under-counting visits and visitors. So it's always worth installing another tracker to be on the safe side. 
  • Updating Tableau is quick - but getting the data manually isn't.
    Once your dashboard is set up, updating it with new data takes just a few clicks. However, data collection from the various sources was in our case mostly manual, and it can be time-consuming (and not much fun either!). If I were still working on the project, I’d look into ways to automate data collection as much as possible - while also looking at what additional (useful) data I could collect in an automated way. 
  • Build it once - and then you *must* iterate.
    When you're done building your dashboard, you're actually not done. We went through a couple of iterations before arriving at the product that is now online, and I'm sure it would be different now had the project continued. You have to evaluate whether the visualizations in the dashboard are actually useful and provide actionable insights that can inform your strategy. And the software keeps evolving, opening up possibilities that were not there before.

In the next post on this series I'll present a different approach to develop an M&E dashboard, this time using a combination of Google Forms, Sheets and Charts, together with Google Apps Scripts and Google Docs Add Ons.

In the meantime, if you have experience with Tableau or use other tools to create interactive dashboards, why not share it in the comments here?

Thursday, February 26, 2015

How to create a monitoring and evaluation dashboard

So you have a website, a blog, the usual social media channels on Twitter/Facebook/YouTube, maybe a series of RSS feeds. On top of this, your organization or research programme also publishes original content, or indexes content produced by others into an online portal. And you also organize events and workshops and maybe offer grants and awards.

With all these online spaces, outputs and products that you produce, how are you going to collect and aggregate this data as part of your monitoring and evaluation activities? And how are you going to display and present it in an effective way that can be easily understood by your co-workers, managers and donors?

For the past couple of years, I’ve been experimenting with tools to display data and information in online dashboards. This post presents a short introduction to the topic. It’s the first of a series of posts that will look into online tools for data collection, storage and display.

What is a dashboard? 

According to Stephen Few’s definition “A dashboard is a visual display of the most important information needed to achieve one or more objectives; consolidated and arranged on a single screen so the information can be monitored at a glance.”

More generally, a dashboard is a visualization of data related to the operation and performance of an organization, program or service. Dashboards are helpful tools to provide you with a quick overview to see whether you’re on track to reach the objectives stated in your logframe or Theory of Change.

Note that information about a wide range of channels can crowd a single screen. So it’s important to be flexible and keep users in mind - keeping scrolling *very* limited and using features like tabbed navigation to view different sets of metrics and indicators.

What are the steps to follow to build a dashboard? 

The Idealware report Data at a Foundation's Fingertips: Creating and Building Dashboards presents an excellent, detailed step-by-step description of the process of designing effective dashboards for nonprofits. Ultimately the process boils down to four main phases:

  1. Define your audience
    Of course this is absolutely critical: your audience determines the way you design the dashboard, the graphs you include, and their order and sequence. The dashboards I’ve developed in the past were mainly designed for managers and executives, to tell them about the progress of a program or service at a quick glance.
  2. Identify the metrics to display - and how you collect them
    With the tons of metrics you could collect, and the space limitations of a dashboard, it is important to agree upfront which ones will be displayed. This requires a bit of negotiation to agree on what’s in and what’s out. Of course, the metrics should be useful for monitoring progress towards the objectives in your logframe and theory of change. In this phase it is also important to discuss collection methods, frequency and access: 
    • Are there any processes you can automate? 
    • What is only possible through manual data collection?
    • Is it realistic to collect this data monthly (for example, if the properties are high-traffic or include active campaigns), or is quarterly more realistic?
    • Where are you going to store the raw data and who should have access to it?
  3. Identify your dashboard platform
    This is a maturing market so there are a lot of possible solutions - from expensive business intelligence software to low cost or free tools. Generally the decision is defined by the resources available as well as the time you and your users have to invest in learning new tools. Note that while potentially you can build a dashboard in Excel, investing some time in learning how to use a powerful and flexible dashboarding tool such as Tableau Public can enable you to design more complete and effective dashboards.
  4. Sketch, prototype and roll out
    In the design of the dashboard you need to find a good balance between the amount of information you want to display and the limited space available. So you have to carefully decide which graphs and charts to use, what explanatory text to include, which colours to use, and so on. It will take a lot of testing and iterating to find the optimal design. Bottom line, your final product should: 
    • Be simple, intuitive, easy to read and understand; 
    • Present data together from different sources in an uncluttered way and following a logical sequence or order; 
    • Offer a quick overview of the key metrics and indicators to assess progress towards the objectives of your program/organization/service. 
In the following posts, I’ll present two case studies about our recent work visualizing monitoring and evaluation data in online, interactive dashboards. I will look specifically at the tools used to put these dashboards together, as well as the individual tools used to collect and store individual indicators and metrics. 

For the more techie readers, I’ll also share the details of what I’ve learned recently using Google Apps Script to automate some data collection and storage processes, as well as tips and tricks to monitor activities and engagement around Twitter, which I’ve been experimenting a lot with lately. So stay tuned!

Monday, December 02, 2013

Do more with your tweets - Social reporting at ICT4Ag

In my previous post about tracking the online buzz generated around the ICT4Ag conference, I mentioned a couple of the tools we used in the social reporting of the event. At the ICT4Ag conference, blogs, workspaces, Twitter and the other ‘usual suspects’ formed the backbone of our technology infrastructure.

However, besides these tried and tested solutions, this time around I wanted to experiment with some new applications around the edges of the process.

In this and the following posts I’m going to zoom in on technology, presenting some new (to me at least) applications that I’ve used, what I’ve learnt - and what I would do differently next time to further develop my own practice.

Specifically, this post looks at different tools to do (much) more with Twitter.

Conference structure, KM and Twitter #tags

The conference programme was organized around three main conference streams, a Plug and Play day and a Hackathon running in parallel to the main event. While it was a no-brainer to decide on the conference tag and the tags for the Plug and Play and Hackathon, the design of the conference in streams and sessions presented some challenges for the effective use of #tags on Twitter.

On the one hand, we were expecting (as indeed happened) quite some buzz on Twitter - and we needed ways to aggregate and disaggregate tweets around different conversations. On the other hand, the CTA KM team was also interested in ways to archive and search the different tweets - around the conference #tag and specifically according to the different conference streams. Finally, we wanted the possibility to display specific Twitter feeds on each tab of our social dashboard.

So once we had decided on a basic #tag vocabulary, with specific #tags for each session, I faced the challenge of finding ways to:
  • Extract information from a Twitter search - ideally an RSS feed of search results for each session #tag; 
  • Aggregate this info to display Twitter feeds around each of the three conference streams; 
  • Find a way to archive and search these tweets.

Display and aggregate Twitter search RSS feeds

This sounded complicated to achieve when I first looked at it. With the recent changes in Twitter API policy, users are no longer able to obtain any of the Twitter streams – search results, timelines of users, users’ favorites etc – in an RSS feed.

After spending some time researching and testing which tools I could use to get a Twitter RSS feed, I found that some smart folks out there had the answer I was looking for, and I was able to use some simple Google Scripts to set up a Twitter RSS feed for each session #tag search result.

Once I had created these 25-odd Twitter search feeds (one for each session #tag), I was able to aggregate them into consolidated feeds for each of the three conference streams using Yahoo Pipes. So, for example, out of the Twitter search feeds for the 12 sessions in the Emerging Innovation stream, I was able to produce a single, aggregated Twitter feed.
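The aggregation step Yahoo Pipes performed boils down to merging several per-session feeds into one list, newest first. A minimal sketch, where the item shape (`{ date, title }`) is my own assumption for illustration:

```javascript
// Sketch: merge several per-session feeds into one stream feed, newest first.
// Each feed is an array of items; item fields are illustrative assumptions.
function mergeFeeds(feeds) {
  return feeds
    .flat()                                          // combine all items
    .sort((a, b) => new Date(b.date) - new Date(a.date)); // newest first
}
```

In practice a feed widget would then render the top N items of the merged list for each conference stream.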

Never miss a Tweet!

Once I’d solved this challenge, I still needed to find a way to archive tweets - as after a while they disappear from the search results. Plus, this Twitter archive had to be searchable, to meet the CTA team’s need to browse and use this content in the future.

The solution here came from a brilliant post by Martin Hawksey, where he presents TAGS5.0 - a way to archive AND visualize tweets, automatically pulling results from a Twitter Search into a Google Spreadsheet.

This was indeed a great discovery, and using it has proved to be of great interest and value.

The full Twitter archive for the conference hashtag #ict4ag13 is available as a public Spreadsheet on Google Drive. You can also explore the interactive visualization of the conversations on Twitter, mapping replies, retweets and mentions, and the relations between different users. Finally, it gives you a complete Twitter archive that can be easily browsed and searched by keyword or by username.
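The keyword/username search over such an archive is conceptually simple. Here is a sketch of my own (not TAGS code - the tweet field names are assumptions, loosely mirroring the columns the spreadsheet stores):

```javascript
// Sketch: filter an archive of tweets by keyword or username,
// case-insensitively. Tweet shape ({ user, text }) is an assumption.
function searchArchive(tweets, query) {
  const q = query.toLowerCase();
  return tweets.filter(t =>
    t.user.toLowerCase().includes(q) || t.text.toLowerCase().includes(q)
  );
}
```

In the actual spreadsheet this kind of matching is what the built-in search view does across the archived columns.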

Together with one main Spreadsheet for the conference hashtag, I’ve also repeated the process and created different Spreadsheets for:
For each of them, you can visualize the interactions and search the online archive of tweets (see the Summary tab, then Public web views).

Back up if you can

While the whole setup was quite easy, I did encounter one main technical problem. The script failed to archive several tweets over two days of the conference - maybe due to the very high volume of tweets generated.

Luckily, I had also set up a backup for the #tag search in Hootsuite, and I was then able to reconstruct the full database in the days after the event. But of course this is not ideal, as it required quite some time moving data from one spreadsheet to the other.

Despite this, the end result is quite useful, I believe, especially as it preserves the tweets for future reference and research, and allows comparing different events in terms of online conversation and engagement on Twitter.

I’d also like to see whether the visualization of the Twitter conversation can be improved to provide a clearer picture of the connections.

Any experts out there who would like to give it a try and improve the visualizations from the complete dataset?


Thursday, November 28, 2013

Tracking reach, understanding engagement - Social reporting at ICT4Ag

Over the past few weeks, I’ve been busy with the preparation and the coordination of the social reporting team for the ICT4Ag conference in Kigali. I'll blog separately about the process and learnings (actually, a lot!) in terms of both technology and people for a successful social reporting effort.

But if you are interested in discussing Twitter reach and online engagement around the event, please stay with me for a few minutes and keep reading.

The starting point for this conversation is the post my friend and colleague Pete Cranston published a few days back. I’m grateful for the questions he poses, as they couldn't provide a better framework for my reflection.

It’s two million, actually...

On one thing, though, Pete is not correct. According to Keyhole, the tool we used to track the conversations around the conference hashtag #ict4ag13 (on Twitter, Facebook and Instagram), the total reach is over 2 million!

Together with reach (defined as "the number of unique followers that a user has - and so the unique number of people that a tweet could potentially get to"), Keyhole also tracks impressions, the number of posts and the number of users who have contributed content around #ict4ag13. Looking at this full picture, the headline figures are even more impressive: over 12 million impressions, from 11,900 posts contributed by 1,272 users.

#ict4ag13 real-time tracker with Keyhole - 12 Oct-11 Nov. 2013
So what does it mean to reach these people? And do these numbers really matter?

On the one hand, if I look only at these figures, I am very pleased with the results - and I am indeed including them in my report back to CTA! Not because I believe in ‘vanity metrics’ (in fact I don’t, as you’ll read), but because these numbers could be enough to say that the objectives of the social reporting project - raising online awareness of the topics/sessions of the conference, updating the online audience on the conference proceedings and engaging them in the discussion - were met.

But I want to move beyond the numbers or - as Pete put it - "avoid the risk posed by pure, refined white sugar."

Keyhole itself provides some useful indicators that go in this direction. It gives you insights into the content that is shared (at the level of domains and individual links). It also provides a useful map that shows where the conversations around the hashtag are happening, the demographics of contributors and the split of posts between original posts, retweets and replies.

To me these are already very interesting analytics and they provide a much richer picture of the online conversation around an event - and how users engage with it.

However, if you want to understand more about the value of social media for events and where different users position themselves on the ladder of engagement, especially when we look at Twitter conversations, I think different approaches are needed.

Tracking engagement, mapping conversations

While preparing for the event, I spent quite some time researching online for tools that could allow me to archive all the tweets around #ict4ag13, and to do some more analysis beyond the usual suspect metrics. I was looking for a free tool, and after several searches - and testing some applications - the right query string finally landed me on this post by Martin Hawksey, where he presents a way to archive AND visualize tweets, automatically pulling results from a Twitter search into a Google Spreadsheet.

I am not going into the technical details of how this works (Martin does a pretty good job himself of explaining how to set up and use this script). I’ll cover the technical problems I encountered in follow-up posts.

Again, I’d like to focus here on the results, and show you how different the picture looks if you add another layer of analysis to the data available.




(Note - An interactive version of this visualization is available online, but it may sometimes take a long time to load. You can hover over a node to see a summary of the data recorded in the archive. By clicking on a node you can see the conversations that person had, condensed into 30 seconds.)

As you can see in the image above, this visualization shows the conversations between Twitter users around #ict4ag13 - a network analysis where the nodes represent users and the connections between them are determined by replies and conversations - rather than just retweets.

While this is probably not perfect - the visualization, for example, could be improved for a clearer picture - I think it is very useful for evaluating the conversation on Twitter. In the case of the ICT4Ag conference, while many users contributed content, not all of them engaged in conversations. Rather, this happens among a core, central group of users, while many more remain at the periphery - broadcasting and engaging with the content of the conference, but not really exchanging with one another.

Show off but connect the dots!

So while I am definitely using some ‘vanity metrics’ in my report back to CTA, I think a fair assessment of social media engagement around an event like ICT4Ag needs to look beyond these numbers. It needs to map the conversations and the content that is shared. Most importantly, once you have this information, it is critical to act on it and pay more attention to enlarging the conversation - making sure that each contributor does not talk to themselves but engages with others. Or, to put it another way, that each dot has at least one line connecting to it.
