Research – Data Collection

I. IVR systems for data collection

Being voice-based, IVR systems are a potential way to collect data even from populations with low literacy. SMS can be hard for this demographic, and local-language SMS is still not uniformly supported across phones. IVR, however, requires significant research to achieve accurate and complete data collection; results can vary with the demographic and the complexity of the application.

1. Z. Koradia and A. Seth, “Phonepeti: Exploring the Role of an Answering Machine System in a Community Radio Station in India”, ICTD 2012.

Community: Migrant workers in Gurgaon

Result: Despite detailed training through radio programs, barely 40% of the callers understood that they were talking to a machine and could leave a message.

2. A. Grover, K. Calteaux, E. Barnard, and G. van Huyssteen, “A voice service for feedback on school meals”, ACM DEV 2012.

Community: School children in South Africa

Result: Over 85% task-completion rates on both numeric answers and spoken monosyllabic answers within a small, well-defined vocabulary. The children were trained through in-person interactions.

3. A. Lerer, M. Ward, and S. Amarasinghe, “Evaluation of IVR Data Collection UIs for Untrained Rural Users”, ACM DEV 2010

Community: Teachers in Ugandan schools

Result: Higher task completion with free-form audio input than with touchtone numeric input. The tone in which the questions are asked, advance messages that prepare users for the call, and other careful design choices are known to influence the quality of the data.

4. The following are some products other than those by Gram Vaani:

- FreedomFone: A customizable IVR engine developed by Kubatana in Zimbabwe, and deployed widely across Africa

II. SMS, mobile, and web applications for data collection

SMS is cheap and can be parsed automatically, making it well suited to collecting small amounts of structured data. More extensive data should be collected through mobile or web applications. Significantly more research is, however, required to evaluate the accuracy and suitability of different methods in different contexts.
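As a rough illustration of why SMS suits small structured reports, the sketch below parses a message written in a made-up "KEYWORD field=number" format. The message format, the REPORT keyword, the field names, and the parse_report function are all assumptions for illustration; none of them come from the products or papers listed in this section.

```python
import re

# Hypothetical message format (an assumption for this sketch):
#   "<KEYWORD> <field>=<integer> <field>=<integer> ..."
FIELD = re.compile(r"([a-z_]+)=(\d+)")


def parse_report(text, keyword="REPORT"):
    """Return a dict of integer fields, or None if the message does not match."""
    parts = text.strip().split(None, 1)
    if not parts or parts[0].upper() != keyword:
        return None  # empty message or wrong keyword
    body = parts[1].lower() if len(parts) > 1 else ""
    fields = {name: int(value) for name, value in FIELD.findall(body)}
    return fields or None  # a keyword with no fields is treated as invalid
```

A message such as "report served=42 absent=3" yields a dictionary with the two counts, while anything that does not start with the keyword is rejected and could be routed to a person for follow-up.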

1. B. Birnbaum, B. DeRenzi, A. Flaxman, and N. Lesh, “Automated Quality Control for Mobile Data Collection”, ACM DEV 2012

Collected data often contains errors. The paper discusses techniques for finding outliers that do not match the expected trend, which helps detect fake data entries.
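To make the idea of outlier-based quality control concrete, here is a minimal sketch that flags data collectors whose average reported value sits far from the group median, measured with the median absolute deviation. This is a generic robust-statistics heuristic chosen for illustration and is not the algorithm from the paper; the function name, the data layout, and the threshold are assumptions.

```python
from statistics import mean, median


def flag_suspicious(values_by_collector, threshold=3.5):
    """Flag collectors whose mean value is a robust outlier.

    values_by_collector: dict mapping a collector id to a list of numbers.
    threshold: cutoff on the modified z-score (3.5 is a common rule of thumb).
    """
    means = {c: mean(v) for c, v in values_by_collector.items() if v}
    if len(means) < 3:
        return []  # too few collectors to compare
    centre = median(means.values())
    mad = median(abs(m - centre) for m in means.values())
    if mad == 0:
        return []  # no spread to measure against
    # 0.6745 scales the MAD so the score is comparable to a z-score
    return sorted(
        c for c, m in means.items()
        if 0.6745 * abs(m - centre) / mad > threshold
    )
```

A flagged collector is only a candidate for review; an unusual average can also reflect a genuinely different village or clinic, so a check like this would normally feed a manual audit rather than reject data outright.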

2. S. Patnaik, E. Brunskill, and W. Thies, “Evaluating the Accuracy of Data Collection on Mobile Phones: A Study of Forms, SMS, and Voice”, ICTD 2009

This paper discusses the trade-offs in accuracy among data collected through voice, SMS, and mobile forms.

3. The following are some products for SMS-based data collection:

- Frontline SMS: Deployed widely in Africa, this is used for a number of community interfacing activities including polls, group SMS, alerts, etc.

- Ushahidi: Combines SMS with Google Maps to give an instant geo-visualization of activities. Developed in the aftermath of the 2007 Kenyan elections

- Episurveyor: One of the first health data collection efforts using SMS. Developed by DataDyne

4. The following are some mobile applications for data collection:

- Commcare: Runs on Nokia phones, used widely by community health workers. Developed by Dimagi Inc.

- Open Data Kit: Used for a variety of data collection activities, with error checking and correction capabilities. Developed by the University of Washington

5. The following are some products for web-based data collection:

- Open EMR: Maintenance of health records in rural health clinics

III. Paper-based systems for data collection

Nothing beats paper, however, primarily because people are used to door-to-door, paper-based data collection. Digitizing the paper forms is the most challenging part, and researchers have developed several innovative techniques for it over the years.

1. T. Parikh, “Using Mobile Phones for Secure, Distributed Document Processing in the Developing World”, IEEE Pervasive Computing April 2005

The paper describes a barcode-based system that recognizes the specific form and then extracts data from it using a predefined template.
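The template idea can be sketched as a lookup from a form identifier to named regions of the page. In the sketch below, the TEMPLATES table, the "LOAN-01" identifier, the field names, and the crop_fields function are hypothetical, and decoding the barcode is assumed to have happened already; this is not the system described in the paper.

```python
# Hypothetical templates: form id -> field name -> (top, left, bottom, right)
# in pixel coordinates. Real layouts would come from whoever designed the form.
TEMPLATES = {
    "LOAN-01": {"member_id": (0, 0, 2, 4), "amount": (2, 0, 4, 4)},
}


def crop_fields(form_id, image):
    """Return {field name: sub-image} for the form identified by the barcode.

    image is a list of pixel rows. Each cropped region would then be handed
    to whatever recognition step suits that field.
    """
    template = TEMPLATES.get(form_id)
    if template is None:
        raise KeyError(f"no template registered for form {form_id!r}")
    return {
        name: [row[left:right] for row in image[top:bottom]]
        for name, (top, left, bottom, right) in template.items()
    }
```

Keeping the layout in data rather than in code means a new form only needs a new template entry, which is the main appeal of identifying the form first.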

2. A. Ratan, S. Chakraborty, P. Chitnis, K. Toyama, K. Ooi, M. Phiong, and M. Koenig, “Managing Microfinance with Paper, Pen, and Digital Slate”, ICTD 2010

The authors propose placing a paper form on top of a digital slate, so the slate records whatever is written on the paper form.

3. N. Dell, N. Breit, T. Chaluco, J. Crawford, and G. Borriello, “Digitizing Paper Forms with Mobile Imaging Technologies”, ACM DEV 2012

The authors take classic Optical Mark Recognition to the next level, using a camera to photograph forms and then processing the image to extract data.
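The core of mark recognition can be sketched as counting dark pixels inside each expected bubble region. The sketch below assumes the photograph has already been aligned to the form and converted to grayscale, which is the hard part in practice; the read_marks function, its thresholds, and the region format are assumptions for illustration and do not reproduce the authors' pipeline.

```python
def read_marks(image, bubbles, dark_below=128, min_fill=0.5):
    """Return {label: True/False} for each bubble region.

    image: list of rows of grayscale values (0 = black, 255 = white).
    bubbles: dict mapping a label to (top, left, bottom, right).
    A bubble counts as marked when at least min_fill of its pixels
    are darker than dark_below.
    """
    result = {}
    for label, (top, left, bottom, right) in bubbles.items():
        pixels = [p for row in image[top:bottom] for p in row[left:right]]
        dark = sum(1 for p in pixels if p < dark_below)
        # An empty region (coordinates off the page) is reported as unmarked
        result[label] = bool(pixels) and dark / len(pixels) >= min_fill
    return result
```

Both thresholds would need tuning against real photographs, since lighting, shadows, and pen pressure all change how dark a filled bubble appears.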

