AI again: Silva’s experience

Silva Ferretti, a colleague in international evaluation, has written an inspiring post on AI in evaluation that she has kindly allowed me to reproduce here. Sit back and enjoy the read!

>> I have been playing with Artificial Intelligence for some time now. I am amazed by it, and actually surprised by the lack of debate regarding its role in development and humanitarian programme management. Have I missed any discussions on this topic? If anyone has information or pointers, I would greatly appreciate it. It is a game changer. We should seriously look into this NOW.

I learnt that:

  • It can write well-crafted logical frameworks and programme concepts, as well as sectoral strategies, that are on par with, or even better than, some real ones. It can anticipate risks and limitations, and propose detailed activities.
  • It is inclusive and politically aware, in a positive way. It has been trained to value inclusion and diversity, and is skilled at articulating ideas of participation and accountability, while also understanding that these ideas can generate conflict.
  • It is progressive and embraces a variety of methods and approaches. It can easily determine when rigorous/objective research is needed and when more constructivist methods should be used. It understands the advantages and areas of application of complexity-aware and feminist approaches.
  • It is creative and can use various communication styles. It suggested that conventional monitoring and evaluation methods may not be suitable for some programmes, and helped me generate anecdotes, commercials and even a rap song.
  • It excels at concepts, not facts. It does not provide references or links, and may sometimes confuse the names of standards or approaches. However, it understands the core concepts and can provide valuable insights. It is not a search engine, but a different paradigm.

What do I take from it?
1) The AI looks so good because a lot of development and humanitarian work is based on set approaches and jargon. We play by the book when writing projects and when monitoring and evaluating change. This has advantages, of course (we should not always reinvent the wheel!). But this is also where an AI works best. It is like those professionals who are good at making any project look cool, using the right words: nice, streamlined, even when reality is messy. And, sadly, what surfaces about many projects and programmes are just these sanitized proposals and reports: confirmation of pre-set causal chains, with pre-set indicators… whilst local partners and change makers would tell more interesting and varied stories. It is the sanitized stories that eventually travel up the reporting chain, and into the AI of the future. This generates confirmation bias. And it strengthens models that are accepted and established because we keep using them, with the same lenses and logic. But reality is not like the blueprint.
2) The AI is more progressive than several professionals/institutions in recognizing the whole field of complexity and complexity-driven approaches. Have a chat with it, asking what approaches are best in diverse contexts. It is adamant that participatory and empowerment processes require ad-hoc approaches. The lesson? The available evidence already indicates that there is not only one appropriate way to manage and evaluate (the bureaucratic/rigorous one). The fact that a machine understands the importance of the non-quantifiable, of emergence, of feminist approaches – while some human managers don’t get it… – well, it makes me think a lot.
3) The AI can be really “creative” when prompted. Try it out, and discover the many ways we could use to share the same concepts: poems, songs, riddles, conversations, anecdotes, stories. It is liberating, and a great way to free our own creativity and reach new audiences when talking about change. It can add a whole new “communication dimension” to monitoring, evaluation, and programming.
4) It is already happening. Artificial intelligence is not going to materialize in some faraway future. You can do pretty decent work with it now. For routine tasks, including proposal writing, it is at least as good as a mid-level officer who needs supervision. How are we going to react? How should we use this tool? What will we teach the next generation of professionals?

I am not afraid that AI can substitute humans. I am amazed, mesmerized by it. I find it stimulating. It provides, on a whim, the approach “by the book”, and frees up time to then go ahead and explore the details, the alternatives, the possibilities.
I fear that it can substitute humans as they are currently used and regarded by the development/humanitarian systems: bureaucrats applying tried and tested approaches, and having to confirm them using a standardized set of tools (which really do not allow one to “go outside the box”). If this is the understanding, whole layers of professionals can be made redundant. Soon you could run your perfect programme, monitored by other AIs.
Which, to some extent, will (and probably also should!) happen.

What many people are already advocating re: monitoring, evaluation, learning is becoming more urgent. Do we have the tools to address the complexity ahead?
What is in that black box between your question and the AI’s response? How is the knowledge feeding it being built?
Our role will be, more and more, to check how ideas are formed, to explore mindsets behind the blueprints being formed, to check the boundaries of the issues we confront, to explore the links that are not yet there. Issues of interrelations, worldviews, boundaries, inclusion: it is the deep level use of systemic thinking that we should embrace, if we want to be enhanced, not trapped by the AI.

Am I optimistic? Not much, because the trend in the sector is towards control, compliance, procedures. Even this list sometimes worries me, when I see the same debates over and over about things that AI can already do better. But it consoles me that even the AI believes humans still matter. In its words:

Humans have several unique abilities when it comes to understanding complexity that AI currently cannot replicate. Here are some examples:

  • Empathy: Humans have the ability to understand and relate to the emotions, thoughts, and experiences of others. This allows them to understand the perspectives of different groups and to make decisions that take into account the needs and feelings of all stakeholders.
  • Intuition: Humans have the ability to make decisions based on their intuition or gut feelings, which can be particularly useful in complex and rapidly changing situations where there may be limited data or time to analyze information.
  • Creativity: Humans have the ability to think creatively and come up with innovative solutions to complex problems. They can think outside the box and come up with new and unexpected ideas.
  • Complex decision making: Humans can take into account multiple variables, such as cultural and political context, community dynamics, and the specific needs of different groups, and make decisions accordingly. They can also consider the ethical implications of their decisions.
  • Contextual understanding: Humans can draw on their experiences and knowledge to understand the specific context in which a humanitarian or development project is being implemented, and how it relates to the larger social, economic, and political environment.

It’s important to note that AI can assist in providing information and ideas, but the human abilities to empathize, use intuition, think creatively, make complex decisions and understand context are unique. In summary, my main function is to provide information and ideas, but the real work is done by the humans, who have the ability to understand the complexity of the situation, make decisions, and adapt to the context. Humans also have the ability to empathize and provide compassionate support, which is essential.

All things that are recognized by complexity-driven, adaptive, participatory management practices. But are the development and humanitarian sectors really run on this understanding?

Again… there are many urgent debates to have. And I would love to have pointers, resources, insights about them. Thanks in anticipation.

[If you wonder: I asked the AI if it agreed with the ideas shared in this mail. “As a language model, I am not capable of having opinions or feelings. However, I can confirm that the statement you’ve provided is consistent with the capabilities that I have been trained on.” And, while I was at it, I asked the AI to do some edits for readability – the initial bullet points. So, this mail is not co-written… but it is co-edited, discussed, validated. 🙂 ]

AI and evaluation

This year, there are lots of interesting discussions on the use of artificial intelligence in evaluation. I have started trying out ChatGPT to help me find out about mainstream information/opinion on the subjects I have worked on, develop workshop plans, and draft an email message. The results are interesting: at first sight, ChatGPT’s responses look quite good, but closer scrutiny reveals them to be quite mediocre. That is normal, because ChatGPT relies on what is out there – all of it, or an average view of all of it, i.e., not the most advanced or most exciting thinking. Still, one can build on a moderately useful response to develop something more suitable for the task at hand. An important first step is to question what the machine has produced and to reflect on what is missing in the machine-generated answer. It could be the entire construct the answer is based on…

Serious applications aside, ChatGPT has tricks to cheer up our work routines. This is what it produced when I asked for a summary of principles of evaluation in the style of Emily Dickinson:

To assess, to know, to understand,
The ways of evaluation, so grand.
First, set clear goals for what you seek,
And measure progress, week by week.

Gather data, both near and far,
To see what’s working and what’s not par.
Look for patterns, trends, and clues,
And use this knowledge to inform your views.

Consider all perspectives, wide and narrow,
To understand the impact, deep and thorough.
Be open to feedback, both good and bad,
And use it to improve, like iron clad.

Celebrate successes, big and small,
And share them widely, one and all.
Acknowledge challenges, too, with grace,
And work together, to find a new pace.

In all your evaluations, always strive
To keep an open mind, and be alive.
For learning and growth, and progress to come,
Are the gifts of evaluation, to some.


International advocacy

Last year I followed an exhilarating course on qualitative data visualisation with the PME Campus (in German). Among many other things, we learned how to design infographics using online platforms that make it really easy. For the example below, I first created a word cloud based on the short description of a funding scheme for international advocacy. Then, I uploaded the word cloud as a background for my infographic. A bit messy, but fun!

Infographic summarising key findings on international advocacy


Michaela’s first infographic
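For readers who prefer scripting over web platforms: under the hood, a word cloud is essentially a word-frequency count with words sized by frequency. A minimal sketch of that counting step in standard-library Python (the sample description text below is invented for illustration, not the actual funding scheme text):

```python
from collections import Counter
import re

def word_frequencies(text, min_length=4):
    """Count how often each word of at least min_length letters appears --
    the raw input that word-cloud tools size their words by."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if len(w) >= min_length)

# Hypothetical stand-in for the funding scheme description
description = (
    "The funding scheme supports international advocacy: advocacy networks, "
    "advocacy campaigns and policy dialogue at the international level."
)

print(word_frequencies(description).most_common(3))
# -> [('advocacy', 3), ('international', 2), ('funding', 1)]
```

The `min_length` filter is a crude stand-in for the stop-word lists that word-cloud platforms apply automatically; for real use one would filter common words explicitly.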

Feminist foreign policy and evaluation

DEval, the German Institute for the Evaluation of Development Cooperation, celebrated its 10th anniversary last night. It was a real-life event at a beautiful Berlin location, bringing together an impressive crowd, including, among others, Svenja Schulze, our Federal Minister for Economic Cooperation and Development. One of the topics of her keynote speech was the current federal government’s commitment to feminist development policy. What does that mean for evaluation? Responding to a question from Minister Schulze, Jörg Faust, Director of DEval, came up with four aspects:

  • Do No Harm/research ethics, e.g., anonymising data about interviewees
  • Context-sensitive research
  • Evaluation design that ensures a wide spread of people are ‘appropriately heard’
  • More diverse evaluation teams

While these elements definitely make good ingredients for a feminist approach to evaluation, I wonder what is feminist about it. Shouldn’t any evaluation tick all these boxes?

As the Federal Ministry for Economic Cooperation and Development (BMZ) puts it, “feminist development policy is centred around all people and tackles the root causes of injustice such as power relations between genders, social norms and role models.” Let’s set aside this concept of ‘centering around all people’ – I guess it only means that feminist policy is not for women only. Let’s look at the other half of the sentence. Wouldn’t that mean that evaluations should look into power relations and other (potential) root causes of gendered injustice, or at least examine whether and how projects have attempted to address those root causes? And what does it take for non-male people at the margins of society to be ‘appropriately heard’? Won’t evaluators need to spend more time listening to more non-male people, in their own languages (by the way, Translators without Borders appears to be doing a wonderful job on this)? Shouldn’t we have individual conversations not only with those who hold positions of power in a project, but also with intended ‘ultimate beneficiaries’ of various backgrounds?

This is an aside, but an aside that is close to my heart. Often, I find it somewhat disrespectful and methodologically dodgy when evaluators organise group discussions for ‘grassroots’ women to share how a project has changed (or not) aspects of their lives, while more privileged project stakeholders and external specialists are interviewed individually. Wouldn’t a feminist approach have to turn this upside down, by inviting powerful people to reflect on project and context issues in focus groups, and organising individual interviews to learn about ‘grassroots’ women’s personal experience of the project?

And, as evaluators, could we make a bigger effort to speak with women’s and lesbian, gay, bi, trans, intersex and queer (LGBTIQ) rights groups wherever we go, and generally identify more diverse experts for our key informant interviews? How about involving local/national/regional women’s and broader human rights experts and activists in the development of our data collection tools, in data analysis, and in crafting locally viable recommendations with a potential to transform power relations?

Sounds like this is asking too much? True, many evaluations I have come across (and I have seen many, in many roles) display only modest efforts to integrate gender and equity concerns, even though equity is part of the updated OECD-DAC effectiveness criterion for evaluation. Often, all you learn from such evaluations are the old messages that women and girls are worse off than the rest, and that social norms are to blame for that. Not very satisfying.

But there are evaluations out there, carried out by teams with a keen sense for rights-based work and power analysis, which have made the effort to reveal and test the assumptions about gender roles underlying the programme logic. They have shown how a programme logic or theory of change built on a mistaken understanding of gender roles contributes to unwanted effects. That is the kind of finding that makes it into the executive summary of an evaluation report, and that is likely to open people’s eyes to the harm a conventional, gender-blind approach to development can cause. Let’s not allow ‘feminist evaluation’ to become a mere buzzword, or an excuse for wishy-washy methodologies. Let’s turn it into something meaningful that will yield new, potentially transformative, insights.

A real life workshop with a virtual facilitator

A few weeks ago I ended up as the virtual facilitator in a workshop that everybody else attended in ‚real life‘, at a pleasant venue in the countryside – and it worked out nicely! Here is how we went about it. Spoiler: Sunshine and plenty of greenery have played an important part.

The workshop was supposed to happen at a lovely place in the countryside, on a sunny late-summer day. I was looking forward to being there, and to working with the group of people who had hired me as an external facilitator. Then, two days before the workshop, COVID-19 arrived in my household. I was fine, testing negative, but my client felt it was safer for me to stay away from the workshop venue. We had to regroup and reorganise, both on the human and the technical front:

On the human side, I needed a pair of eyes and ears in the room. We appointed a participant who would be my connection to ‘the room’ (that is what facilitators sometimes call the group they work with). That turned out to be essential, not least because ‘the room’ was outdoors and all over the place. We agreed that the co-facilitator would devote most of her attention to her co-facilitating role, which involved not only eyes and ears, but also hands-on management of the participants’ verbal contributions.

At the physical venue, there was something they called a ‘tower’ – basically, a webcam and a multidirectional microphone on top of a set of speakers. When people took turns speaking, it worked well enough, but I could not see more than a fifth or a quarter of the participants. There was also a projector that initially beamed my face onto a videoconference screen – I quickly added an online whiteboard where I summarised key points on virtual post-its (instead of posters in the room).

Most importantly, there was the wonderful countryside outside. It had been my plan to organise plenty of small group work, anyway – so, for most of the day, I invited the participants to wander off in random or purposefully composed duos and trios and quartets to work in the vast outdoor space. During small group work, the co-facilitator would walk around, listen in here and there, and ring me up with information as to how the groups were doing and what subsequent steps would make sense. After each small group session (varying from 15 minutes to an hour or so), the participants came back to the conference room to share key conclusions, which I recorded on the virtual whiteboard, before sending them off again with new small group assignments (in varying groups).

Near the end of the day, there was a strong feeling that one issue needed plenary discussion – again, I decided to relinquish control and make use of the outdoor space. I provided only simple rules for the discussion that would allow every participant to speak up in a calm atmosphere, and asked the co-facilitator to remind participants of the rules if needed. (Hint: I use rules inspired by Nancy Kline’s Time to Think.) After an hour, everybody came back to the webcam, seemingly refreshed – which is extremely unusual for a long workshop day! – and equipped with important insights.

Would I do it again? With a co-facilitator, OK equipment and a pleasant space for the participants, absolutely!

Everyday evaluation template

The evaluation budget is too small to give serious attention to the 45 evaluation questions you are supposed to answer within four weeks? Hanneke de Bode has the solution! She has shared a long rant about the contentious power of evaluations on a popular evaluation mailing list.

Hanneke contributed to a discussion about the lack of published evaluations commissioned by non-governmental organisations (NGOs). Arguably, one reason is the limited quality one can achieve with often very limited resources for smaller evaluations. She has made such a beautiful point that I am not the only one sharing this on my blog – our much-esteemed colleague Jindra Cekan is also going to spread it across her networks, with Hanneke’s kind permission. And here comes Hanneke’s 101 for small evaluations! Does that ring a bell?

Most important elements of a standard evaluation report for NGOs and their donors: about twenty days of work, about €20,000 (VAT included)

In reality, the work takes at least twice as much time as calculated, and will still be incomplete/quick and dirty, because it cannot decently be done within the proposed framework of conditions while answering all 87 or so questions that normally figure in the ToR.


The main issues in the project/ programme, the main findings, the main conclusions and the main recommendations, presented in a positive and stimulating way (the standard request from the Comms and Fundraising departments) and pointing the way to the sunny uplands. This summary is written after a management response to the draft report has been ‘shared with you’. The management response normally says:

  • this is too superficial (even if you explain that it could not be done better, given the constraints);
  • this is incomplete (even if you didn’t receive the information you needed);
  • this is not what we asked for (even if you had agreement about the deliverables);
  • you have not understood us (even if your informants do not agree among themselves and contradict each other);
  • you have not used the right documents (even if these are the ones they gave you);
  • you have got the numbers wrong; the situation has changed in the meantime (even if the numbers were in your docs);
  • your reasoning is wrong (meaning: we don’t like it);
  • the respondents to the survey(s)/ the interviews were the wrong ones (even if the evaluand suggested them);
  • we have already detected these issues ourselves, so there is no need to put them in the report (meaning: don’t be so negative).

Who the commissioning organisation is, what they do, who the evaluand is, what the main questions for the evaluators were, who got selected to do this work and how they understood the questions and the work in general.


In the Terms of Reference for the evaluation, many commissioners already state how they want an evaluation done. This list is almost invariably forced on the evaluators, thereby reducing them from having independent status to being the ‘hired help’ from a Temp Agency:

  • briefings by director and SMT members for scoping and better understanding
  • desk research leading to notes about facts/ salient issues/ questions for clarification
  • survey(s) among a wider stakeholder population
  • 20-40 interviews with internal/ external stakeholders
  • analysis of data/ information
  • recommendations
  • processing feedback on the draft report

In the Terms of Reference, many commissioners already state which deliverable they want and in what form:

  • survey(s)
  • interviews
  • round table/ discussion of findings and conclusions
  • draft report
  • final report
  • presentation to/ discussion with selected stakeholders

Many commissioners send evaluators enormous folders with countless documents, often amounting to over 3000 pages of uncurated text with often unclear status (re. authors, purpose, date, audience) and more or less touching upon the facts the evaluators are on a mission to find. This happens even when the evaluators give them a short list with the most relevant docs (such as grant proposal/ project plan with budget, time and staff calculations, work plans, intermediate reports, intermediate assessments and contact lists). Processing them leads to the following result:

According to one/ some of the many documents that were provided:

  • the organisation’s vision is that everybody should have everything freely and without effort
  • the organisation’s mission is to work towards providing part of everything to not everybody, in selected areas
  • the project’s/ programme’s ToC indicates that if wishes were horses, poor men would ride
  • the project’s/ programme’s duration was four/ five years
  • the project’s/ programme’s goal/ aim/ objective was to provide selected parts of not everything to selected parts of not everybody, to make sure the competent authorities would support the cause and enshrine the provisions in law, the beneficiaries would enjoy the intended benefits, understand how to maintain them and teach others to get, enjoy and amplify them, that the media would report favourably on the efforts, in all countries/ regions/ cities/ villages concerned and that the project/ programme would be able to sustain itself and have a long afterlife
  • the project’s/ programme’s instruments were fundraising and/ or service provision and/ or advocacy
  • the project/ programme had some kind of work/ implementation plan


This is where practice meets theory. It normally ends up in the report like this:

Due to a variety of causes:

  • unexpectedly slow administrative procedures
  • funds being late in arriving
  • bigger than expected pushback and/ or less cooperation than hoped for from authorities- competitors- other NGOs- local stakeholders
  • sudden changes in project/ programme governance and/ or management
  • incomplete and/ or incoherent project/ programme design
  • incomplete planning of project/ programme activities
  • social unrest and/ or armed conflicts
  • Covid

The project/ programme had a late/ slow/ rocky start. Furthermore, the project/ programme was hampered by:

  • partial implementation because of a misunderstanding of the Theory of Change which few employees know about/ have seen/ understand, design and/ or planning flaws and/ or financing flaws and/ or moved goalposts and/ or mission drift and/ or personal preferences and/ or opportunism
  • a limited mandate and insufficient authority for the project’s/ programme’s management
  • high attrition among and/ or unavailability of key staff
  • a lack of complementary advocacy and lobbying work
  • patchy financial reporting and/ or divergent formats for reporting to different donors taking time and concentration away
  • absent/ insufficient monitoring and documenting of progress
  • little or no adjusting because of absent or ignored monitoring results/ rigid donor requirements
  • limited possibilities of stakeholder engagement with birds/ rivers/ forests/ children/ rape survivors/ people in occupied territories/ murdered people/ people dependent on NGO jobs & cash etc.
  • internal tensions and conflicting interests
  • neglected internal/ external communications
  • un/ pleasant working culture/ lack of trust/ intimidation/ coercion/ culture of being nice and uncritical/ favouritism
  • the inaccessibility of conflict areas
  • Covid

Although these issues had already been flagged in:

  • the evaluation of the project’s/ programme’s first phase
  • the midterm review
  • the project’s/ programme’s Steering Committee meetings
  • the project’s/ programme’s Advisory Board meetings
  • the project’s/ programme’s Management Team meetings

very little change seems to have been introduced by the project managers/ has been detected by the evaluators.

In terms of the OECD/ DAC criteria, the evaluators have found the following:

  • relevance – the idea is nice, but does it cut the mustard?/ others do this too/ better
  • coherence – so so, see above
  • efficiency – so so, see above
  • effectiveness – so so, see above
  • impact – we see a bit here and there, sometimes unexpected positive/ negative results too, but will the positives last? It is too soon to tell, but see above
  • sustainability – unclear/ limited/ no plans so far

If an organisation is (almost) the only one in its field, or if the cause is still a worthy cause, as evaluators you don’t want the painful parts of your assessments to reach adversaries. This also explains the vague language in many reports and why overall conclusions are often phrased as:

However, the obstacles mentioned above were cleverly navigated by the knowledgeable and committed project/ programme staff in such a way that in the end, the project/ programme can be said to have achieved its goal/ aim/ objective to a considerable extent.


Most NGO commissioners make drawing up a list of recommendations compulsory. Although there is a discussion within the evaluation community about evaluators’ competence to do precisely that, many issues found in this type of evaluation have organisational, not content-related, origins. The corresponding recommendations are rarely rocket science and could be formulated by most people with basic organisational insight or a bit of public service or governance experience. Where content is concerned, many evaluators are selected for their thematic experience and expertise, so it is not necessarily wrong to make suggestions.

They often look like this:

Project/ programme governance
  • limit the number of different bodies and make remit/ decision making power explicit
  • have real progress reports
  • have real meetings with a real agenda, real documents, real minutes, real decisions and real follow-up
  • adjust
  • communicate
Organisational management
  • consult staff on recommendations/ have learning sessions
  • draft implementation plan for recommendations
  • carry them out
  • communicate
Processes and procedures
  • get staff agreement on them
  • commit them to paper
  • stick to them – but not rigidly
  • communicate

Obviously, if we don’t get organisational structure and functioning, programme or project design, implementation, monitoring, evaluation and learning right, there is scant hope for the longer-term sustainability of the results that we should all be aiming for.

Less is more in evaluation questions

I am republishing this 2019 post because of a recent, heated discussion on a popular evaluation list server. It is about the harmful impact of excessive evaluation questions on evaluation quality.

Writing evaluation terms of reference (TOR) – that is, the document that tells the evaluators what they are supposed to find out – is not a simple exercise. Arguably, the hardest part is the evaluation questions. That section of evaluation TOR tends to grow longer and longer. This is a problem: abundant, detailed evaluation questions may lock the evaluator into the perspective of those who have drawn up the TOR, turning the evaluation into an exercise with quite predictable outcomes. That limits learning opportunities for everyone involved.

Imagine you are an evaluator developing an offer for an evaluation, or working on an inception report. You sit at your table, alone or with your teammates, and gaze at the TOR page (or pages) with the evaluation questions. Lists of 30-40 items totalling 60-100 questions are not uncommon. Some questions may be broad – of the type “how relevant is the intervention in its context” – and some extremely specific, for instance, “do the training materials match the trainers’ skills”. (I am making these up, but they are pretty close to real life.) While you are reading, sorting and restructuring the questions, important questions come to your mind that are not on the TOR list. You would really like to look into them. But there are already 70 evaluation questions your client wants to see answered, and the client has made it clear they won’t shed a single one. There is only so much one can do within a limited budget and time frame. What will most evaluation teams do? Bury their own ideas and focus on the client’s questions, ending up carrying out the evaluation within the client’s mental space. That mental space may be rich in knowledge and experience – but still, it represents the client’s perspective. That is an inefficient use of evaluation consultants – especially in the case of external evaluations, which are supposed to shed an independent, objective or at least different light on a project.

Why do organisations come up with those long lists of very specific questions? As an evaluator and an author of meta-evaluations based on hundreds of evaluation reports, I have two hypotheses:

  • Some evaluations are shoddy. Understandably, people in organisations that have experienced sloppy evaluations wish to take some control of the process, and they don’t realise that tight control means losing learning opportunities. But it takes substantial evaluation experience to provide meaningful guidance to evaluators – where evaluation managers have limited experience in the type of evaluation they are commissioning, their efforts to take control can be counter-productive.
  • Many organisations adhere to the very commendable practice of involving many people in TOR preparation – but their evaluation department is shy about filtering and tightening the questions, losing an opportunity to shape them into a coherent, manageable package.

What can we do about it? Those who develop TOR should focus on a small set of central questions they would like to have answered – try to remain within five broad questions and leave the detail to be sorted during the inception phase. Build in time for an inception report, where the evaluators present how they will answer the questions, and what indicators or what guiding questions they’ll use in their research. Read that report carefully to see whether it addresses the important details you are looking for – if it doesn’t and if you still feel certain details are important, then discuss them with the evaluators.

My advice to evaluators is not to surrender too early – some clients will be delighted to be presented with a restructured, clearer set of evaluation questions. If they can’t be convinced to reduce their questions, then try to agree on which questions should be prioritised, and explain which ones cannot be answered with a reasonable degree of validity. This may seem banal to some among you – but judging from many evaluation reports in the international cooperation sector, it doesn’t always happen.

Five tips for remote facilitation

This is a rerun of a blog post I wrote a year before I started running training workshops on online facilitation (with the PME Campus, for example). All of what I wrote then is still valid. Since I have promised I would move some posts from my old blog to this new one, here is the post:

Despite the risks and uncertainties associated with independent consulting, I have never felt as privileged as I do now, living in a country with a highly developed, accessible health system, working from my customary home office, and equipped with a decent internet connection and the hardware needed to stay in touch with friends and colleagues. The crisis has been an opportunity to develop my remote facilitation skills. Before, I facilitated the occasional „real-life“ workshop in a conference room with video equipment, with participants in other locations joining us via Skype or the like. I have shared that type of hybrid experience on the Gender and Evaluation community pages.

Now I have gone one step further, facilitating fully remote workshops from my home office. I mean interactive workshops with some 5-20 people producing a plan, a strategic review or other joint piece of work together – not webinars or explanatory videos with hundreds of people huddling around a lecturer who dominates the session. To my delight, virtual facilitation has worked out beautifully in the workshops I have run so far. Good preparation is a key element – as in any workshop. I have distilled a few tips from my recent experience and from the participants‘ feedback.

  • Plan thoroughly and modestly. Three to four hours per workshop day is enough – and there is only so much you can do in half a day. Factor in breaks (at least one per hour), time for people to get into and out of virtual breakout rooms, and at least five minutes per workshop hour for any technical glitches.
  • Try to make sure all participants can see each other’s faces. Some videoconferencing platforms allow you to see dozens of participants on the same screen. If you use a platform that shows only a handful of speakers, try to rotate speakers so that everyone can catch a glimpse of every participant. Apparently, recent research shows that remote meetings are more effective if people see each other. Smile! Keep interacting with your webcam and watch participants‘ faces as carefully as you would if you were in a room with them.
  • Pick facilitation tools that match your participants‘ digital skills. I love software that allows everyone to post „virtual“ sticky notes and move them around on a shared whiteboard. But that’ll work only if all (or a critical mass of) participants like experimenting with web-based tools. If many participants are uncomfortable with collaborative web-based visualisation, then you can record key points on the virtual whiteboard (live or between sessions), or ask participants to send their text contributions to you or your co-facilitator to post them on their behalf. The best way to gauge participants‘ readiness is a technical rehearsal well before the workshop (ideally, at least a week earlier).
  • Share a written technical briefing before the workshop. That should include (i) the links and passwords to the conference and the tools, (ii) guidance as to how to maximise data transmission speed – for instance, by using a LAN cable, by switching off Wi-Fi on all non-essential devices, by temporarily disabling Windows updates, by closing all other computer windows etc., (iii) guidance on troubleshooting in case of major technical problems (e.g. alternative dial-in numbers, persons to contact if a participant fails to get back online), and possibly (iv) links to a couple of very short (1-2 minute) tutorials for any software you may use for web-based joint visualisation or other forms of co-creation.
  • Do your homework. And give homework. If the digital tools you’ll use are new to you, try them out with colleagues and friends before the actual workshop. There is a growing body of video tutorials on the sprawling world of virtual collaboration; check out these resources. I also like the quick primer for running online events on Better Evaluation, which contains plenty of useful links. Before and in-between workshops, invite participants to try out any tools that are new to them, and/or to continue working on the collaborative virtual whiteboard.

It is generally recommended to work as a tandem, with one facilitator running the workshop and the other one looking after the technical aspects. But if you facilitate only one to two three- to four-hour sessions a week and you type really fast, then you can manage on your own. Be prepared, though, to feel totally exhausted after each session!

Take time when preparing (for) evaluations

In 2012, I published a post with the title above on my former blog. And I still see major evaluations with budgets running into hundreds of thousands of euros that come with a four-week inception phase, or that are supposed to start basically the day after the evaluation firm or evaluator has been selected. That is wasteful, because an evaluation that is not tailored to its users‘ needs risks being… useless.

Ideally, one should start planning an evaluation right when the project/programme that is supposed to be evaluated starts. Back in 2012, I recommended starting to recruit evaluation teams at least six months ahead of the field work – at that time, the evaluations I had in mind were evaluations of individual projects run by civil society organisations (CSOs). With anything bigger or more complicated, I’d plead for much, much more time for finding the evaluation team, briefing it and developing a robust evaluation design with instruments that fit their purpose. But the gist of my 2012 blog is still valid – and I had promised to re-publish a few of my earlier posts. Here it is:

There has been an extraordinary flurry of calls for proposals for external evaluations. This is good news; it suggests that people find it important to evaluate their work. But, upon closer examination, you’ll notice that many calls expect the evaluations to begin just a couple of weeks after the deadline for offers, and to end within a month or so. That is frustrating for experienced consultants, who tend to be fully booked several months ahead. Narrow time frames may also make it difficult for those who commission the evaluation to identify sufficiently skilled and experienced candidates. If you take evaluation seriously, then surely you want it to be done in the best possible way with the available resources?

Over the years, I have come to appreciate time as a major element of evaluation quality. Most development organisations (not only CSOs) cannot and do not want to afford full-fledged scientific-quality research, which typically involves plenty of people with advanced academic degrees and several years of research. That is perfectly reasonable: if you need to make programme decisions on the basis of evaluations, you can’t afford to wait for years. (The programme would be over, the context would have changed, your organisation would have changed its priorities, to name but a few likely problems.) But what one can afford – even on a shoestring budget – is to allow plenty of time for thinking and discussing during the preparatory phase of an evaluation. In that way, you can make sure (among other things):

  • the terms of reference (TOR) express exactly what you need
  • the participating organisations are well-prepared and welcoming (which they are more likely to be if the TOR have been worked out with them and take their wishes into account)
  • the evaluation team understands what they are supposed to evaluate
  • the evaluators can reflect on different options, discuss these with key evaluation stakeholders, and let their thoughts mature over a few weeks before deciding on the final design
  • there is enough time to sample sites & projects/components so as to achieve a maximum of representativeness or a good choice of cases – to avoid visiting only what a Chinese expression calls „fields by the road“
  • data collection tools can be pre-tested, adjusted, those collecting the data trained and so forth

Extra time for these activities does not necessarily mean more consulting days – just spreading out the budgeted days, and finding ways of making better use of existing data in the project, can make a big difference.