Data and the SDGs: From Opportunities to Impact

21 December 2017 - An Open Forum on Other in Geneva, Switzerland

Full Session Transcript


>> NADIA ISLER:  Good morning, everyone.  Welcome to this cozy session on the eve of the holiday.  I thought I would come and shake hands to know who is in the room so we can also maybe tailor the conversation afterwards according to your specific interests.

So thank you for being here.  First of all, I'm Nadia, the director of the SDG Lab, which is a recently created unit which sits in the Office of the Director General of the United Nations here in Geneva.  My team is here also.  I could talk about the Lab for hours, but in a nutshell it is helpful for you to know what we are doing and why I am here.  We are basically capitalizing on and maximizing what international Geneva has to offer as a multi‑stakeholder platform and space, and supporting the implementation of the SDGs.  And we look at specific global, regional and national challenges in SDG implementation.  The theme of data is a big one out there.  How to tackle it and what the added value of the Lab is in this mega sphere is still something that we are trying to pin down.

But chairing this meeting today is an opportunity for us to hear voices of experts on the matter to really understand more in‑depth the challenges linked to data, Big Data collection and how it can accelerate the implementation of the SDGs and look into what the opportunities and also what the challenges are.

So we have a very short session.  I will have to close it at 10 past 11.  So I'll be extremely short in my introductory remarks because I would like to give Barbara, Rosy, Linus and John time to share.  If we all ask each other what part of the data puzzle you are working on or what is in your view the most important about data and the SDGs, we will all have a different angle.  Yet we talk about data as if it is like one homogeneous bubble.

So today the intention of this meeting is to look at the data equation through different lenses, through the speakers on the panel.  And then this is just the beginning of a conversation.  I think we will have maybe some key questions coming out of this session which the SDG Lab would be happy in partnership with others to tackle at another level, including also with Member States and the broader Geneva community.

So as we know, data is being used and has been used in the past and will continue to be used for policy making, for orienting strategic decisions.  And it is also used for the monitoring of the implementation of the agenda.  So as we know there are great opportunities and specifically now with all the different tools and technology related to Big Data, we know that there are new doors open in terms of capturing elements we had never had a chance to capture before.  So the opportunities are absolutely huge, but the challenges are also emerging.  We hear lots of recurring questions: what data are we collecting and for what use?  How is data collected?  And how is data analyzed?  Who is collecting the data?  That is often also a concern that we hear over and over again from many Member States, as well as how it is collected by those institutions, people or organizations.

And then how do we actually format data in such a way that it really does help policy making?  We often have Member States dropping by the SDG Lab telling us how much they appreciate the opportunities of new data out there, but it is also overwhelming.  And they dream of having data collected and analyzed for them so that they can make sound policy decisions.  Having more data is sometimes more daunting than anything else.

So how do we grapple with that challenge, too?  So today we have four speakers.  Barbara Rosen from the DiploFoundation.  Linus Bengtsson, executive director of the Flowminder Foundation.  Rosy Mondardini, managing director of the ETH/UZH Citizen Science Center.  And John Crowley, manager of knowledge and learning at IFRC.

So I would first like to give the floor to Barbara.  I think you will all be making short slide presentations.  So you have five minutes each to share with us which part of the data puzzle you belong to and how you are tackling the equation of how data could best be used to accelerate the 2030 Agenda.  We are on Webcast.  We have quite a few people following us on the Internet.  And we have ‑‑ remind me your name?  Katharina is the Moderator of the session online.  We may be getting questions from people out there.  Barbara, the floor is yours for five minutes.

>> BARBARA ROSEN:  Thank you, Nadia.  I will just first start with a very brief kind of overview or context in which we are discussing this.  The importance of data in modern society is very obvious.  And it is also obviously present in the agenda of the IGF of today, of this week.  More and more data is generated and aggregated in Big Data sets.  And this has given a lot of opportunity to make better informed policies and better target people in need.  And in parallel it has placed quite some demands on the monitoring of progress, especially in development, because of the realization that more might be possible.  And in this context the adoption of the Sustainable Development Goals in 2015 went hand in hand with 169 targets and 232 indicators, with this awareness that modern goals need a modern measurement system.

So with the many targets and indicators there is this idea that the SDGs should serve to leave no one behind.  Data is important to fill the gaps on vulnerable communities that we cannot detect without very, very accurate and big and also disaggregated data, so that we know whether they are embraced by development efforts or not.

So with this background I conducted a little study over the summer related to the High‑Level Political Forum.  The High‑Level Political Forum takes place every year in New York and discusses sustainable development and progress towards the SDGs.  And I analyzed in which context and how much the word data was mentioned in reports by IISD and the United Nations.  I also analyzed this over the years.  I started with reports from 2014 and indeed what I thought was confirmed: data is increasingly mentioned in the debates around the SDGs.  But what is maybe more interesting is that I looked at how data has been talked about in discussions on the SDGs.  And I found out that there has been an overwhelming focus on disaggregation.  Data disaggregation was the main context in which the word data was used.

At the same time there was some focus on availability of data and the need to collect data, but other topics surrounding data were not often mentioned.  And this raises the question: if disaggregation is so important, how do we get this right and mitigate the challenges of disaggregating data?  Related challenges are, for example, capacity building.  Because of the enormous number of targets and indicators for which data needs to be collected, capacity building is very, very important.  And there is only very slowly growing awareness of the capacity that is needed.  And I think there is a little bit of a danger in countries that need to do reporting relying on third parties.  There needs to be a certain basic level of capacity to understand whether these analyses are accurate, what kind of data is used, and whether the data is relevant for monitoring the SDGs, especially when talking about new data.

An additional challenge not often talked about is privacy in the Sustainable Development Goals, especially if there is a focus on disaggregated data and the increased collection of data along ethnic lines or gender or age or geographic location.  The question arises whether in parallel this might give rise to increasingly sensitive data being collected, and how do we protect it and know that this data is used for good and for sustainable development and not against it.  I think in sum there is a lot of ground to gain to really capture the benefits of new kinds of data for the SDGs.  But we need to have an informed discussion on how to materialize this in practice.  And with these enormous data needs, Big Data and crowdsourced, citizen‑generated and open data can be of true help if they are used intelligently, sustainably and responsibly.

And I'm also looking forward to hearing from my three co‑panelists more about each of these areas.  But I found out that there is already a very great awareness that data and Big Data and new forms of data can have a lot of potential.  And now the real opportunity is to kind of turn this hype into real action and real impact.

>> NADIA ISLER:  Thank you very much, Barbara.  I think this was a good opening, sort of a presentation of the big framework in which we are operating.  And thank you for highlighting the challenges related to disaggregation.  As you mentioned, indeed in New York many of the national voluntary presentations made by Member States underscored that challenge of how we do that best.  I would like to give the floor to Linus Bengtsson.

>> LINUS BENGTSSON:  Thank you.  Great update, Barbara, on disaggregation.  And I don't think this was planned in any way but that's exactly what we focus on.  So that was great.  I'll give a bit of an overview of who we are and how we approach the questions of measuring and monitoring the SDGs.  So we are a non‑profit.  We are coming from the academic sector, most of us.  We are about 60 staff, most based in the UK but we have an office here as well.  We partner with a lot of private data providers and public‑body data providers, as well as with partner agencies, international agencies and Governments, and support them with robust analysis of new types of data, focusing really on old problems but leveraging all the new statistical methods and the new data that is available.

We publish everything we do in peer reviewed journals.  And I think that's important because it is a very new field.  There is a lot to explore.  And it is very easy to make nice visualizations which look perfect but actually don't correspond to the reality on the ground.  And I think this is a key point I want to make.  We need to validate and we need to use really robust methods.

So a few words about the data sources.  We use these fancy new data sources.  And this is the most important one: traditional household surveys and census data, asking people questions about their lives, exactly what we want to measure.  Then we have the fancy stuff.  So we pioneered the use of mobile operator data back in 2008.  And we have been working since in many countries using this for humanitarian and development purposes.  And in addition we use a huge number of geospatial layers.  Basically maps of something.  So you see on the top there, it is mobile data.  But to the right we have topography and we have the household surveys and temperature and we have slope.  We have tons of information we can now gather from space.  And each of them is very biased as a proxy for what we actually want to capture.  So the whole problem we approach is how we should use these data sources in a way that adjusts for the biases and really extracts the information.

So this is what we do.  We produce high resolution, disaggregated maps of population densities, characteristics, and their dynamics.  So first of all, we produce for all low and middle income countries the number of people living in each 100 by 100 meter cell.

So this is open, free data.  You go to a website and download it and use it however you want.  Secondly we profile the characteristics of these populations.  So for all the data shown there before we have age and gender disaggregation, but we are also doing much more sophisticated work.  And so as an example here, female literacy in Nigeria.  We know from traditional surveys, great household surveys, that measure literacy ‑‑ this is the disaggregated distribution among females.  Better in the south than in the north.  We then combine these household surveys and these new fancy data sources with robust statistical methods.  And we produce these types of maps.  So per square kilometer, the proportion of females being literate in Nigeria, and this enables you to take decisions and prioritize resources in a whole different way.

And there is a lot of work behind this map.  And I have probably one minute left.  But feel free to contact us.

This is poverty in East Africa and we do this for a lot of indicators.  Finally, we have dynamics and changes in population distributions.  And this is migration patterns in Bangladesh.  This is based on 40 million mobile phones with Grameenphone in Bangladesh.  And the phones by themselves have quite a bit of bias.  And it is a big job to develop methods to account for that.  But we use this for understanding how population densities change with the seasons and weeks and over time.  And maybe this is the most striking example, when we have huge redistributions of the population, which really shows the value of the mobile data.  So we work in a lot of disaster settings.  So this is the number of people living in the Kathmandu Valley before the earthquake.  This is the distribution after the earthquake.  We can see on a national level how this outflow after the earthquake looked.  So I think that was my ‑‑ just to say finally that these data are now finding their way into very important big UN international reports.  So it is no longer just a scientific exercise.  That was the last one I think.

>> NADIA ISLER:  Wonderful.  It is a pleasure to be chairing a meeting where the speakers are so disciplined.  And your last slide kind of answered the question I had in my mind: this disaggregated data you are working on, where does it land?  Do you go a step further into advising, for example, ministries, et cetera, on strategic choices and planning?  We can discuss that afterwards in the Plenary.  Thank you very much.  I would like to give the floor now to Rosy, the managing director of the Citizen Science Center.  Tell us more.

>> ROSY MONDARDINI:  I will talk about data generated by Citizen Science projects and the role they can have for the Sustainable Development Goals.  So Citizen Science ‑‑ okay.  Citizen Science is a practice basically where citizens and scientists collaborate to do scientific research.  And most of the time the scientists design the project.  They come up with scientific questions.  The citizens collect data.  And then the scientists do the analysis.  But more and more we actually see citizens participating as well in the analysis phase and sometimes really participating in the whole process.  This has been around as a methodology for ages.  And citizens have helped astronomers and biologists in projects that go from classifying galaxies to checking the quality of water, to monitoring deforestation.

Who are these people?  These are people of any age, any background, from every part of the world, and why do they do that?  There are a lot of studies on that, but mainly because they want to participate in the advancement of science or because they are really passionate about the topic or because the topic is very close to them.  Maybe projects that have to do with health issues or projects for the environment and biodiversity.  I go to the next one.  So these Citizen Science projects are actually called different things nowadays: crowdsourcing, do it yourself science.  These are slightly different approaches.  For instance, do it yourself is when citizens also build the instruments and the tools to collect the data.  But they are all based on the same general principle that citizens are at the heart of the activity and of the data collection.  So I will give you just three examples of such projects to give you an idea of where these kinds of projects can go.

So the first one is Safecast.  This is now an NGO that was created after the big earthquake in Japan in 2011, and especially after the Fukushima explosions.  People wanted to know the level of radiation in their own houses and streets, while the data from the Government were not that fine grained.  So basically they thought, okay, let's build Geiger counters to measure the radiation and let's give them to the people.  So in the space of a month they had the first prototype, this open source Geiger detector.  And from that moment up to now they have provided the most detailed and precise map of radiation levels in Japan.  And their data and their operation have been recognized by authorities and by the scientific community as well.  They have been invited to present their data at the International Atomic Energy Agency conference.

Second example ‑‑ what am I doing here?  Yeah.  Okay.  Foldit.  It is a different way to involve citizens.  It is a game where you can fold proteins in three‑dimensional space.  Remember that the folding of a protein is very important because it basically determines the function of the protein.  And scientists had been studying the structure of the protein that underlies the HIV enzyme for 15 years without any result until they decided to use this game and they gave the problem to the gamers, 230,000 gamers.  And guess what, for a problem that scientists had not solved in 15 years, how long did it take the citizens?  Ten days.  So in ten days they solved something that the scientists could not.  And these were kids playing a game.  And in the publication in Nature that followed the discovery there are 51,000 Foldit gamers that are recognized, and they are credited with coming up with a structure that outperforms anything that could be done with computers.

So the third example is the Humanitarian OpenStreetMap Team.  Every time you have a natural disaster, having very detailed and up‑to‑date maps of the region where the disaster happened is very important.  And this is what this team does.  Every time anything happens, anywhere, they rally a big network of volunteers that go online and basically analyze the satellite images of the specific area.  They look at the images before the event and after the event and very patiently trace buildings and infrastructure.  And they spot schools.  They spot broken bridges or damaged buildings.  And by the way, this is not only used for disaster relief; the volunteers have also been asked to map the population in different areas.  Because this kind of mapping is very important anywhere there is a disaster, but especially in the poorer and more vulnerable areas, and those areas often are not even on the map.  So the information that a volunteer provides is the only information that exists.  So, three examples.  I could go on and give you hundreds more, honestly, in many different fields.  But my point was that data generated by citizens not only can be used to fill the gaps in science, but can also provide substance for making informed decisions, for encouraging the self‑determination of people and communities and, of course, for supporting monitoring and accountability in the context of the SDGs.

>> NADIA ISLER:  Thank you very much, Rosy.  This is sort of a new dimension I think for many of us, to bring in the role of citizens in data collection.  And I was struck by your last point also, how the crowdsourcing of data collection can also help to reach the most vulnerable populations, as Barbara was mentioning: how can data help us to reach people who are left behind.  I think you gave a very concrete example of how that was done.  I have questions on the validation of this data, et cetera: is it being challenged because it is collected by citizens or not?  Or does the fact that you are working also with scientists give credibility to the data?  Maybe we can address those questions later.  I would now like to give the floor to John from the International Federation of Red Cross and Red Crescent Societies.  John.

>> JOHN CROWLEY:  Thank you.  It is wonderful to see HOT up there.  We helped create it, a number of us.  Clicker would be great.  Thank you.

In the process of creating this it is really important to realize that because reality is changing so fast, because cities are changing so fast, the data that we are collecting are really describing a dynamic reality.  What I am trying to do is to go through really quickly a practical approach and give a couple of examples of how open data is critical for building out our understanding of not only monitoring the SDGs but achieving them.

First is just a definition of open data.  Ultimately data is open where two things are true.  One, it has to be legally open.  The license on the data has to be set so that you can redistribute it.  There are a number of different ways of doing this, but that legal element is one that's often missed.  And secondly, it has to be technically open.  The format has to be something that is human readable and machine readable, so that we can begin to use it in multiple ways.  It is not in a proprietary format that makes it difficult.  Many of us are looking at datasets that come from national level collections, census data.  Maybe we get down to admin 1.  Maybe admin 2.

But at what level of resolution is it necessary to make decisions?  And that's really the key element of both the open data collection from the crowdsource as well as getting Government data released for use within the SDG context.  Even if we have really high resolution data for one dataset, if it is now being combined with other datasets that are really coarse, it still leads to a decision that is coarse.  This is where the rubber hits the road with decision makers.  If we don't have the data to describe the local reality, they rely on the trusted people around them to provide insight.  That difference between what our data can do and what relationships enable is where a lot of the politics of data come into play.

We also have the challenge of understanding that many of the indicators that we have created are around monitoring.  How many of them are really around helping us achieve the goals?  How many of them are set up so that we are looking for proxies to understand what is happening, or are we building datasets that enable decisions that lead to investments, change and action?  There are different ways to begin discerning those.  I want to go through a couple of them.  The first example is disaster risk management.  If we have a hazard, an area that's earthquake prone, we need to know a lot about the exposure of the assets in that area.

Some of those assets will react very differently depending on their vulnerability.  Only then will we have enough data to understand what impact that earthquake will have and how we engage to make that risk less uncertain, if we are going to be making investments from multi‑national institutions in reducing that risk.  To get there, this is where the Humanitarian OpenStreetMap Team and the OpenStreetMap method have come into play.  We started doing structured data collections around a specific model that enables us to get to information about the buildings that can help us explain that exposure and increase that resolution, so that we would have enough information to work with.

This is usually done through fieldwork, either with digital data collection or, more and more, we are seeing the ability to use paper, with scanning or cell phone pictures of the paper brought back into systems that allow us to trace it.  This goes into OpenStreetMap in some ways.  We are collecting the data that's public and open, and any data that we collect which is about a household, which has personal information, is put into a separate database.  We also have a problem in a second area.  Governments tend to have many datasets which are hard to release.  They are frozen.  Often getting ministries to release them is a matter of dealing with inter‑ministry rivalries.  B&B becomes really a critical problem for us.  We need to know baselines and boundaries.  What are the baseline datasets around demographics and health that we can begin to use, and how do we get those inside of boundary sets that enable us to get to higher resolution analyses?  National level is not good enough; admin 1 is often not good enough.  We need to be able to target investments.  We see this in spatial data platforms which have begun to emerge, where we see open data portals and the ability to put datasets into one place where we can share them among multiple agencies.

Oftentimes this is in a form where we can download it in many different data formats and the license is right on the page.  So you know exactly what the provenance of the data is and what the license of the data is, and you can choose the format that you need.

Missing Maps pulls all this together.  Missing Maps is designed to map the most vulnerable, to look for places where commercial mapping has not led to an understanding of who lives there, and we begin to build up a process for, A, mobilizing the private sector to remote map, to trace satellite imagery, and then work with the communities themselves to begin mapping those areas and telling us what's the land use, what's the clinic and what's the name of that road.  And then the third step is how we as a humanitarian organization begin using that data to target our investments and programs.

One of them, measles, we have to reach a 95% number.  If we are going to eliminate measles in an area we have to reach 95% of children.  That requires us to trace and then mobilize people on the ground to map, and then we can do our intervention.  In Malawi the best that we had in terms of population estimates was actually not that bad by standards.  We had pretty good information but still not good enough for us to target where we send our volunteers, which specific places, and how we mobilize those volunteers to get there.  We worked with Facebook to develop a program to take their disaster maps data and look at check‑ins and begin to get a much higher resolution population map, which enables us to begin taking this, which is exactly the same map as that.

Now we know where to begin to go and begin to target our volunteers.  But we could also overlay the OpenStreetMap data, and this tells us specifically which buildings we check off for those volunteers.  The key here, I think, in terms of building out this local context, is that it requires local knowledge.  It requires high resolution information to make changes, and I think we also are going to have to work more and more on our data literacy if we want to use these datasets and turn this into programs that turn into decisions.  Like a musician that wants to join a choir, we have to learn to sing, and we have to be able to learn to use these datasets in our decisions or we won't know when it is out of tune or inaccurate.

>> NADIA ISLER:  Beautiful.  I love the metaphor at the end.  Thank you, John.  Great presentation.  I think there are new elements that you brought in, complementing also previous speakers, but you focused on data for achieving the SDGs and not only for monitoring them.  I think those are two dimensions that we need to have clear on our radar because sometimes the conversation is a bit kind of mixed in that regard.  You also mentioned the importance of trust and who is collecting the data and how that sort of partnership evolves over time to make sure that the data makes it through to actual policy making decisions, because that's inherently what data should be used for.  And then also stressing the importance of data literacy.  Barbara was saying in her remarks that modern goals require modern data collection, but also modern literacy I guess, or a new way of understanding data.  And that's a huge challenge that we all have.

Thank you very much to all the speakers.  I found it very, very inspiring.  And I hope it has inspired you in the room, too.  The room has slowly filled.  And I see faces from many different stakeholders including Member States represented here, NGOs, people also collecting data at the country level.  I see colleagues from Nigeria, from Tanzania.  There is a diversity of stakeholders in the room and that gives us a good opportunity to have a great conversation in the next 20 minutes.

So I would like to give the floor to all of you, the audience, to ask questions to the speakers first.  And it is okay if it is a bit of a random conversation today.  I think the aim is not to solve the data equation in this session but really to be the start of a dialogue.  So I would like to open the floor.  And please, those who are actually active at the country level on these matters, please do share with us your experiences and your challenges.  That would be great.  Not to say that the others shouldn't speak, but I'm in particular encouraging those voices because it is not often we have you here in Geneva.  The floor is open to ask any questions to the speakers or make any general comments.  If you can identify yourself when you take the floor.

>> PARTICIPANT:  Thank you for the floor.  I'm Mohammed Lucas from Brazil.  And I am here through the CGI program.  In Brazil we participate in the SDG observatory.  We do our work of monitoring and collecting data at a local level.  And basically my question is how can we ensure data quality, especially in Developing Countries that have had big challenges in collecting data?  In the case of Flowminder and the Red Cross, you guys are already collecting data in these countries.  So in this case I would like to know what the main difficulties are that you have in doing this.  So thank you.

>> NADIA ISLER:  Thank you very much.  I will take maybe a few more questions first.  Yes.  Please.

>> PARTICIPANT:  Hello.  My name is Sara and I am from Open Data Hong Kong.  I would like to ask the panel about what you see as the biggest hindrance at the moment in getting access to data.  Is it structural difficulties or political difficulties or technical difficulties?  Because it seems there is a lot of data, but somehow relating to the SDGs there are also many cases where we don't collect data in the first place, and this is not only true for emerging economies but also for rich countries.  Like in Hong Kong, if you want to have reliable data on climate change factors, sometimes the data is not even collected.  So what do you think, how do we get to more consistent datasets across all countries that work on the SDGs?  Thank you.

>> NADIA ISLER:  Thank you very much for your question.  I will take a third one before I bounce it back to the panel if there is interest from the audience.  If there is not, I will give you a chance to think about what you may want to bring up to the panel afterwards.  Two critical questions.  One on the quality of data.  How do you ensure the quality of data if you can, or what are the ways we can mitigate the risks of losing quality of data?

And secondly, what are the hurdles to accessing data which we may know we need but can't access, or which we may not even realize that we need?  So the notion of accessing data and the challenges in that regard.  So two important questions.  Who from the panel would like to take a first go at answering those two key questions?  John.  You have the floor.

>> JOHN CROWLEY:  Let me take the data quality problem.  Many of the issues that emerge in the data collection process emerge as a result of not having enough feedback loops.  If you are going to be working with a community, as we have with OpenStreetMap and the collection processes around community mapping, it is rare that the community or the people you train get it right the first time.  There is a process of beginning with an understanding of what it is that you are trying to collect and why you are collecting it, and then beginning a training process and building up the quality.  These feedback loops continue to go higher and higher up the chain.  The ability to look at a dataset, understand its provenance and how it was put together, make corrections to it, add to it: that is not something that we have, I think, done enough.  We have seen areas where it has been true, but more often than not official datasets are released as if they are the truth, or the truth in a snapshot in time.  We need to be able to correct them and keep them dynamic so the datasets are continually evolving.  That quality assurance process is evolving as a set of techniques, but it requires having the data open and editable. 

 

 

>> NADIA ISLER:  Thanks, John.  Very helpful.  Barbara, I think you wanted to react and then Rosy.  You want to react to the access question?  Rosy, did you want to talk to the first question on quality?  I will have Rosy first on quality and then Barbara on access. 

 

 

>> ROSY MONDARDINI:  This is one of the first questions that you always get in citizen science, and it really depends on a lot of factors in the projects, but it is true.  In general you shouldn't trust the data that the citizen gives you.  So you should have many different ways to make sure that those are quality data.  There are papers, however, whose conclusion is yes: citizens can provide the same quality of data as scientists who would do the same operation, basically.  And the methodologies start from statistical ones for, for instance, the analysis of images, just to make sure that you have a nice distribution around the correct answer.  But you have many other ones.  You have the importance of training.  You can compare, you know, data provided by the citizens with data provided by professionals and kind of calibrate.  You can build a reputation system for the citizens.  So with experience and by participating in these kinds of projects they build up a reputation of being serious contributors.  So you kind of trust their contribution more than others.  So there are many, many different ways that you can actually make sure that the data are quality data. 

 

 

>> NADIA ISLER:  Thank you very much.  The notion of trust keeps coming back also.  Linus, you wanted to ‑‑ Barbara on access and then Linus.  And then I will open the floor for a last round of questions. 

 

 

>> BARBARA ROSEN:  Just to quickly react to the access question.  When it comes to the new forms of data and Big Data there are three main obstacles.  First, even if you have access, datasets might be very, very messy.  And it might be very difficult to conduct analysis.  And I think, because it is perceived as very complex, that is already an obstacle to even starting to think about it.  The second is that most of the Big Data is in the hands of companies.  It is in the hands of the private sector, and negotiating partnerships that are productive and sustainable might be very difficult, although there are more and more partnerships like that happening.  And the last one, I think most importantly: there needs to be capacity and awareness and resources among those who collect the data.  Awareness mainly about what kind of data is available and how it can be used.  And so, you know, there is not just a need for training for those who collect data but also for those decision makers who need to think about how to integrate new data in their strategies. 

 

 

>> NADIA ISLER:  Thanks, Barbara.  I will give the floor to Linus and then to Katharina.  There are questions coming in remotely. 

 

 

>> LINUS BENGTSSON:  Data is not just one thing.  It is a lot of different things.  Looking at the satellite picture and seeing if there is a house there or not, that's something that is perfectly suited for whoever.  But there, of course, are more complicated procedures.  And so we need to have a procedure that is suited to the type of data.  And one reason that people don't share the data is also that they are afraid that other people will think that the data is too low quality.  And I think this citizen link, and scientists as citizens, could be a great thing that could be more of a voluntary peer review system.  Just as we have peer review of scientific journals, we can have a peer review of datasets, and what people submit could be rated by the experts, which speaks to the feedback loop that John was talking about.  I think these two can actually link together in a nice system. 

 

 

>> NADIA ISLER:  Absolutely.  Thank you very much for your remarks.  Katharina, any questions? 

 

 

>> REMOTE MODERATOR:  Yes.  Global data sharing plays a critical role, while more and more territories are trying to control the flow of data in and out of their regions.  What is the panel's view on emerging data privacy laws and their impact on the SDGs? 

 

 

>> NADIA ISLER:  Thank you very much.  So the whole question related to the privacy of data.  Panel, reactions to this critical question?  I think we really have the main questions coming out: quality, access, privacy.  Can I have some reactions on the privacy dimension?  Anybody would like to react to that? 

 

 

>> BARBARA ROSEN:  Yes.  As I already mentioned in my opening presentation, I think privacy is a very, very important thing to look at, especially if you collect more and more disaggregated data, because this data can be used against you as well as for you.  So in that sense I think emerging privacy regulations might be a much needed step to ensure that at least this data is collected responsibly.  Of course, it puts certain limitations on access and use and sharing of data, but I think it is important to keep in mind that this needs to be done responsibly.  Yep. 

 

 

>> NADIA ISLER:  Thank you.  We underscored the trust of emerging companies that are collecting data and the opportunities that those represent, but at the same time if they are not trusted the data will be challenged and not used.  Yes, there was ‑‑ first of all, are there any other panelists that want to take the floor on the privacy issue?  Otherwise I'll hear from the audience.  No requests from the floor.  One, two, three and then I will close ‑‑ one, two, three, four.

 

 

>> PARTICIPANT:  I want to give comments about the privacy issue.  The private sector in China is building some markets of data, data markets for trade.  So the first thing for those markets to do is to check whether the data flowing to that market raises privacy issues.  And they also have a procedure of watching the data for privacy.  For example, if you need data for market analysis, the data provided from this market will not contain names or mobile numbers, just the items relevant to the market analysts.  So I think this is a kind of practice for privacy protection.  Thank you. 

 

 

>> NADIA ISLER:  Thank you very much for that comment on the privacy question.  As I said I will take those three questions here and then give a round ‑‑ a round for the speakers and then we will close.

 

 

>> PARTICIPANT:  Hi.  I am Helena.  I am a medical student from Australia and a youth IGF fellow.  I am interested in humanitarian and disaster relief.  In areas, populations and regions that have limited access or connectivity to the Internet, what do you propose is the best strategy to collect this quality data in a timely and efficient manner? 

 

 

>> LINUS BENGTSSON:  Yes.  We work mostly on mobile operator data.  These towers and networks, they are relatively robust against physical influence, but there are, of course, such problems.  They are resilient in a way: if a tower goes down the other tower takes over.  And so there is this inbuilt resilience in the data that we are using.  But I mean if it is a completely devastating disaster, this is a huge problem.  Then we can't use the data.  And so for some things we need new innovations. 

 

 

>> NADIA ISLER:  John, maybe from the IFRC you may have some inputs on the humanitarian setting. 

 

 

>> JOHN CROWLEY:  Three things.  Working beyond connectivity has its own challenges and is worthy of a deeper conversation.  Let me put up three things.  Use paper, but with QR codes, and understand what it is that we are collecting.  We use this with a technology called Field Papers, which enables us to take the map offline and write on it and bring it back online through a cell phone picture. 

 

 

Second, take the tools offline.  We oftentimes use something called a portable OpenStreetMap server, where we take a portion of OpenStreetMap and bring it offline and work on a WiFi network that's not connected to the Internet, which is a very useful set of tools.  And bring other data collection tools that enable us to work on and offline.  That usually requires some kind of technology that solves the synchronization problem between the global database and a quickly changing local database.  There is a technical problem there that's difficult but is almost solved. 

 

 

>> NADIA ISLER:  Thank you very much.  Last questions.  The gentleman here and then our Danish colleague. 

 

 

>> PARTICIPANT:  My name is Emir.  I am just adding things about the privacy issues.  I think yesterday there was a panel about data protection in humanitarian action, and they have a handbook on data protection in humanitarian action on how to safeguard and protect data, privacy and sensitive data.  But the problem actually is more about how to localize that handbook into real work; that's a bit difficult.  That's my comment.  Thank you. 

 

 

>> NADIA ISLER:  Thank you.  The last question to the panel. 

 

 

>> PARTICIPANT:  Yes.  So my name is Catherine Block.  I am from the Danish Institute of Human Rights.  And it has been working on linking Human Rights and SDGs, working on how to support follow‑up and review and means of implementation of the SDGs from a Human Rights angle as well.  So we are adding to that database also data on the universal periodic review of Member States and how they are meeting Human Rights standards, and in that sense also certain elements of the SDGs. 

 

 

Now my question is more to kind of how this data can be connected to and linked to national statistical agencies and also support in getting Governments to improve their measures around implementing the SDGs.  So how the data that you are providing is actually something that can contribute to highlighting to Governments the need to implement the additional policy measures and provide additional protections, et cetera. 

 

 

>> NADIA ISLER:  I think that's a huge, huge question.  We could spend a whole day on how data actually lands in to policy making and bridge that gap.  Huge question.  Speakers, do you want to give a small answer to a huge question or a few elements of response?  And then I think I won't open ‑‑ great question.  But it is a Pandora's box.  Quick reactions from the panel. 

 

 

>> LINUS BENGTSSON:  They are the most important collaboration partner.  We work with, for example, the Afghan statistical bureau and the Ghana statistical services.  But it is very much early work in how to do these collaborations in the best way.  Right now we are doing most of the work.  How to choose what parts of the work can be done by them and what should be done by us, that's under development.  We will see where we get with that.  I think we want statistical agencies to use computers, but they don't want to build computers.  But certain things in the new data space they should be able to do, and what is cost efficient and what is not cost efficient, that's under development. 

 

 

>> NADIA ISLER:  Rosy, you want to react? 

 

 

>> ROSY MONDARDINI:  Yes, I think it is great to change policies at a global level.  And there are a lot of people working on that.  If citizen science can have a role, it is mainly to change policies at the local level.  Really start small, because the kind of data that citizens can collect can really have an influence at a local level, local administration, community level.  To change habits and to basically lobby for change, starting small, and then eventually you can get bigger. 

 

 

>> NADIA ISLER:  John and then Barbara and then concluding remarks. 

 

 

>> JOHN CROWLEY:  I will probably put out a more radical quote.  Never change the existing system by fighting it.  You change it by making it obsolete.  In some sense the data that we are putting together on the data regimes is competing with NSOs.  And those national statistical offices are making some changes based on that.  And the adaptation they are doing is important.  NSOs have an important vital role to play, but I think the process of bringing adaptation and change requires a little bit of friction. 

 

 

>> NADIA ISLER:  Barbara.  Thanks, John. 

 

 

>> BARBARA ROSEN:  Thank you.  Yes, the statistical offices, some of them, are already very aware of the opportunities of Big Data or new forms of data.  And there are conferences and Working Groups at the UN level around this.  But in general we see that there is quite some reluctance from statistical offices to get started with Big Data.  On one hand there are quality concerns: they have very, very strict guidelines on what statistics should be like.  But there is also continuity: how do we know in ten years that we can measure the same things with these kinds of datasets, or how these indicators are changing from one year to another?  Can you compare across time?  I think that's an important challenge.  But in the end, for Big Data, however you use it, you always need to check it with other statistics or other sources of data.  It can be a good source of data by itself, but I think it can also be important to check biases, check existing information, and complement the things that we are already collecting and the ways in which we already monitor things. 

 

 

>> NADIA ISLER:  Thank you very much, Barbara.  Thank you very much to the entire panel and for this conversation, which I'm sure you all enjoyed but are probably just as frustrated as I am because we want to spend the whole day on this.  But as I said this is really just the beginning of a conversation.  I think the panelists really brought an energized approach to the opportunities of data.  And then in the conversation I think we really looked at the challenges, but it is about striking that balance, tackling those hurdles such as the quality of data, the access to data, the privacy of data.  And then making sure that data is actually used.  And I don't think this is a session in which we can find solutions to that, but I do think that a conversation with all the different stakeholders is where we want to go.  Certainly one of the pillars of the work of the SDG lab is to tackle these very complex issues with all the different stakeholders that have a piece of that puzzle.  And it is particularly important in the framework of the SDGs to make sure that, you know, we look at all the problems through all the different lenses to find solutions together. 

 

 

I think also we need more ‑‑ we need more good practices to show where the private and the public sectors can work together hand in hand on these challenges, building that trust so that the right data is accessed but also used.  I think we need many more of those good practices.  And part of our work at the lab is to amplify those good stories to show that there is a way forward in this complex equation. 

 

 

So I would like to thank you all for your presence.  Thank all the speakers also and wish you hopefully a happy holiday and a restful holiday for you all.  And a lovely stay in Geneva.  Thank you.