The Upanzi Network focuses on creating, testing, innovating, and assisting in implementing digital public goods across the continent. The Upanzi Lab at Carnegie Mellon University Africa is the first and lead laboratory in the network. This lab focuses on tackling the network's goals in the East African region.
The Upanzi Lab at CMU-Africa has a broad range of research projects in the areas of identity, payments, cybersecurity, cloud computing, data governance, artificial intelligence and machine learning, and influencing technology policy recommendations to support low- and middle-income countries.
Research at the Upanzi Lab at CMU-Africa
Cybersecurity
Vulnerability assessment and penetration testing of critical infrastructure and applications (VAPT)
Cybersecurity is important for all digital services, but it is particularly crucial for digital public infrastructure (DPI). Cybersecurity exploits can damage public trust in both the exploited infrastructure and the organizations deploying it. In this project, our teams have run various analyses on DPI, including ID systems, financial interoperability systems, and financial mobile applications. Our focus has been on applications that either originate in Africa or are deployed in Africa.
Recent projects:
- Analysis of 18 financial applications using the OWASP MASVSv2.0
- Secret key analysis of 224 top applications used by African mobile users
- Analysis of digital public goods (DPG) like MOSIP, and Mojaloop
- Collaboration with security vendors like Thales
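The secret key analysis listed above looks for credentials embedded in application packages. The following is a minimal sketch of that idea, assuming the application files have already been unpacked or decompiled to text. The pattern set and the function name `find_secrets` are illustrative assumptions, not the rule set used in the project.

```python
import re

# Illustrative patterns only (assumptions); real scanners use much larger rule sets.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "google_api_key": re.compile(r"AIza[0-9A-Za-z_\-]{35}"),
    "private_key_block": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}


def find_secrets(text):
    """Return a sorted list of (pattern_name, line_number) for every match."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((name, line_number))
    return sorted(findings)
```

A match only indicates a candidate secret; each finding would still need manual confirmation before being reported.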
Academic Security Operations Center
Today, there are few tools or environments for students to learn how to set up and operate a Security Operations Center (SOC), not just in Africa but globally. This project established a fully functional SOC within the Upanzi Network Lab, to serve both as a training platform for students in the network and as a tool for identifying vulnerabilities in the DPI studied by our lab. The SOC monitors all lab systems, including critical DPGs and their interconnected infrastructure, such as MOSIP, SMISHING, DHIS2, OPENDATA, and MOJALOOP. Leveraging a suite of open-source tools, the SOC continuously monitors endpoint and network events and proactively addresses emergent cyber threats like malware and ransomware. The SOC additionally fosters a dynamic learning environment for CMU-Africa students. Through hands-on collaboration, students gain practical insights and experience that strengthen their cybersecurity skills.
Server-side request forgery mitigation using artificial intelligence
Server-side request forgery (SSRF) is a type of attack in which an attacker induces an application server to make requests on the attacker's behalf, and so to access or alter data that would otherwise not be accessible to the attacker. SSRF presents significant risks to the integrity and security of organizations. This project has conducted an empirical analysis of deep-learning-based techniques for detecting (and hence, mitigating) SSRF attacks. The study evaluates the performance of two distinct deep learning models, namely Long Short-Term Memory (LSTM) and Bidirectional Encoder Representations from Transformers (BERT), using a dataset of URLs from the CIC repository. In addition to evaluating existing techniques, we suggest exploring alternative deep learning models, considering hybrid approaches, and evaluating the adaptability and effectiveness of these models in diverse network environments. Lighter-weight models in particular would be helpful for SSRF mitigation in low-resource environments.
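To make the attack concrete, here is a minimal rule-based check of the kind that learned detectors are often compared against. It is a sketch under stated assumptions, not the LSTM or BERT models evaluated in the project: it only flags URLs whose host is a literal internal address, and the function name `is_suspicious_ssrf_target` is hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit


def is_suspicious_ssrf_target(url):
    """Flag URLs whose host is 'localhost' or a literal internal IP address.

    DNS names are not resolved, so a hostname that points at an internal
    address is not caught by this simple rule.
    """
    host = urlsplit(url).hostname
    if host is None:
        return True  # no parseable host: treat as suspicious
    if host == "localhost":
        return True
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return False  # a DNS name; this rule makes no decision about it
    return address.is_private or address.is_loopback or address.is_link_local
```

For example, `http://169.254.169.254/latest/meta-data/` is flagged because the host is a link-local address, while `https://example.org/index.html` is not. Rules like this miss encoded or redirected targets, which is one motivation for learned detectors.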
Smishing data collection and detection in low resource languages
Smishing, a form of SMS phishing, is on the rise in Africa due to the widespread use of mobile phones. This malicious tactic involves deceiving individuals into divulging sensitive information or clicking on harmful links through SMS messages. Existing smishing detection methods are primarily designed for high-resource languages like English, leaving a gap for local languages in Africa. To address this, our study aims to collect SMS datasets tailored to local languages using a multifaceted approach. This includes establishing a honeynet infrastructure, implementing an SMS submission web portal, and collaborating with national-level Mobile Network Operators (MNOs) for SMS submissions via USSD codes.
The collected datasets will be used to train a machine learning model for detecting smishing messages on mobile devices, incorporating URL link analysis to enhance its efficacy. By combining these localized datasets with advanced detection techniques, our goal is to protect mobile users in Africa against the increasing threat of smishing attacks, and in the process to assemble the first smishing dataset covering both English and African languages.
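As an illustration of how URL link analysis might feed such a model, the sketch below turns an SMS message into a few simple features. The keyword and shortener lists are hypothetical, English-only placeholders and are not derived from the project's data. A real detector would learn from the collected datasets.

```python
import re
from urllib.parse import urlsplit

URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)
# Hypothetical lists for illustration only.
SHORTENER_HOSTS = {"bit.ly", "tinyurl.com", "t.co"}
URGENCY_WORDS = {"urgent", "verify", "suspended", "winner", "pin"}


def extract_features(message):
    """Return simple URL and keyword features for one SMS message."""
    urls = URL_PATTERN.findall(message)
    hosts = [(urlsplit(u).hostname or "").lower() for u in urls]
    words = set(re.findall(r"[a-z]+", message.lower()))
    return {
        "num_urls": len(urls),
        "has_shortened_url": any(h in SHORTENER_HOSTS for h in hosts),
        "num_urgency_words": len(words & URGENCY_WORDS),
    }
```

Features like these could be concatenated with text features before training a classifier. For low-resource languages the keyword list would have to come from the collected data rather than be written by hand.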
picoCTF-Africa
Today, there are limited resources for African high school and college students to learn about cybersecurity. picoCTF-Africa is a free computer security competition for undergraduate and graduate students across the African continent, created by security and privacy experts at Carnegie Mellon University Africa. This competition is part of picoCTF, the annual computer security competition and learning platform created by CyLab Security and Privacy Institute. Participants of picoCTF-Africa are ranked on an African-only leaderboard within the annual picoCTF competition. The competition runs every year in March, and attracts over 1,000 students a year. Our team also runs training programs in individual schools to help prepare students.
African-themed cybersecurity comics for educating pre-teens about online safety
Today, there are relatively few resources for children to learn about online risks and techniques for protecting themselves. We are developing a comic book that introduces cybersecurity concepts to children. The comic book promotes awareness of various cyber threats, such as malware, phishing, hacking, and identity theft, teaching children about potential dangers and empowering them to safeguard their devices and personal information. The comic book also imparts best practices for cybersecurity and responsible online behavior, educating children on creating strong passwords, avoiding public Wi-Fi networks, and keeping software updated to enhance their defense against cyber-attacks. The book facilitates the development of critical thinking skills by presenting real-life cyber threat scenarios and prompting children to contemplate appropriate responses. Lastly, by infusing cybersecurity concepts into an entertaining narrative, the comic book contributes to the promotion of STEM education, inspiring children’s interest in science, technology, engineering, and mathematics and encouraging potential careers in these fields.
The Upanzi Network and CyLab-Africa
Cybersecurity research in the Upanzi Network is done in partnership with CyLab-Africa.
DPG/DPI governance and deployment
Privacy expectations and realities of the digital public goods standard
Digital public goods (DPGs) are increasingly being used to deliver government services around the world (e.g., ID management, healthcare registration). Because DPGs may handle sensitive data, the UN has established user privacy as a first-order requirement for DPGs. The privacy risks of DPGs are currently managed in part by the DPG standard, which includes a prerequisite questionnaire with questions designed to evaluate a DPG's privacy posture. In this project, we examine the effectiveness of the DPG standard (as of 2023) for ensuring adequate privacy protections. We present a systematic assessment of responses from DPGs regarding their protections of users' privacy.
Deployability and security of MOSIP
MOSIP is an open-source foundational digital ID system that allows users to enroll and get authenticated using their biometric credentials. While MOSIP as a DPG has the potential to deliver value to the digitalization of Africa, very few countries have successfully implemented it. Even experienced engineering teams cannot foresee all the challenges that adopters will face when using the system.
In this project, we:
- Publish deployment guides and a security report of the latest LTS version (1.2)
- Contribute security and integration toolkits to lower the technical barrier for 3rd parties
Microservice debugging (with OpenCRVS)
OpenCRVS is an open-source digital civil registration system. Today, OpenCRVS implementers face challenges in gaining visibility into the status of services and servers, making it difficult to understand causes of application failures. To address this issue, the Upanzi Network is collaborating with the OpenCRVS team on a dashboard that provides real-time insights into the state of servers and services within the OpenCRVS system. This initiative aims to enhance monitoring capabilities and facilitate a better understanding of the application's performance.
Financial interoperability using MOJALOOP & use cases
Mojaloop is open-source software that organizations can use to build interoperable digital payment systems that enable seamless, affordable financial services between individual users, banks, government entities, merchants, mobile network operators, providers, and technology companies, connecting the underserved with the emerging digital economy. We are building use cases around Mojaloop to see how it can be adopted in low-resource environments, such as those in Africa, and what it would take for Mojaloop to be used in a real-life scenario. The key objectives of this research include assessing the effectiveness of Mojaloop in reducing transaction costs, supporting cross-border interoperability, and promoting the inclusion of underserved communities in well-established financial frameworks. The goal of this research is to provide practical insights for policymakers, financial institutions, and technology developers committed to advancing financial inclusion in Africa.
Interoperability of digital public goods
Digital public goods (DPGs) are emerging in various sectors and for diverse user groups, yet there's a notable lack of interaction among them. This siloed development approach poses a risk of underutilizing the potential benefits of interoperability. Our project addresses this issue by focusing on the creation of a service bus designed to facilitate seamless communication among digital public goods, minimizing the duplication of efforts required for interoperability by DPG owners.
As a practical application, we've developed middleware that fosters interoperability, with a specific emphasis on authenticating DHIS2 users through MOSIP as a centralized identity source. MOSIP, an open-source foundational digital ID system, enables users to enroll and authenticate using biometric credentials. Moving forward, our objective is to extend this integration to other key DPGs such as OpenCRVS, and to use cases that ease operations within Carnegie Mellon University Africa.
Additionally, our project aims to establish a reference framework that serves as a guide for building DPGs, ensuring interoperability and reducing duplication of efforts. By providing this framework, we aspire to enhance collaboration and efficiency in the development of digital public goods across various sectors and user groups.
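The service bus idea can be sketched as follows: each DPG registers one adapter with the bus, rather than maintaining a separate point-to-point integration with every other DPG. This is a hypothetical illustration. The class `ServiceBus` and the adapter `fake_identity_adapter` are invented names and do not call any real MOSIP or DHIS2 API.

```python
class ServiceBus:
    """Routes requests to registered adapters by service name."""

    def __init__(self):
        self._adapters = {}

    def register(self, service_name, adapter):
        # An adapter is any callable that takes a payload dict and returns a dict.
        self._adapters[service_name] = adapter

    def request(self, service_name, payload):
        if service_name not in self._adapters:
            raise KeyError(f"no adapter registered for {service_name!r}")
        return self._adapters[service_name](payload)


def fake_identity_adapter(payload):
    # Stand-in for an identity provider; accepts one hard-coded demo user.
    return {"authenticated": payload.get("user_id") == "demo-user"}
```

A consuming application would then call `bus.request("identity", {...})` without knowing which identity system sits behind the adapter, which is where the reduction in duplicated integration effort comes from.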
Publications and articles
Digital Public Goods Interoperability: A Low-Code Middleware Approach
Public health and agriculture
Livestock biometrics for identification and tracking
The ability to identify and track individual livestock through digital systems is important for improved disease surveillance and for compliance with import/export regulations. Linking such a system with national ID systems can also help rural, poor, and marginalized communities prove ownership of livestock and use them as collateral (movable assets) for credit from formal financial institutions.
Microchipping is the dominant means of digitally identifying livestock. However, microchips can be damaged or removed, and they require close contact with the animal, which can be dangerous in the event of disease outbreaks. Using animals' own biometrics for identification can resolve these challenges. We are investigating the performance (accuracy, speed, cost, scalability) of different neural network architectures for verifying the identity of cattle based on muzzle print images of varying quality (resolution, image angle, lighting).
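One common way to frame biometric verification is to compare an embedding of the enrolled image with an embedding of the probe image and accept the match if they are similar enough. The sketch below assumes that embeddings have already been produced by some neural network (not shown). The function names and the threshold value of 0.8 are illustrative assumptions, not results from this project.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def is_same_animal(enrolled, probe, threshold=0.8):
    # The threshold trades false accepts against false rejects and would be
    # tuned on validation data; 0.8 is a placeholder.
    return cosine_similarity(enrolled, probe) >= threshold
```

In practice the threshold would be chosen per architecture and image quality level, which is part of what a performance comparison across architectures has to account for.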
A machine learning model for malaria screening
Malaria remains a significant global health challenge, particularly in regions with limited access to quality healthcare services. The burden of manual microscopy-based malaria diagnosis is immense, as it is time-consuming, prone to human error, and requires trained personnel. This project studies how to automate the malaria diagnosis process. Using publicly available datasets of malaria parasite images, we train a deep learning model to classify which parasite is present in an image. This can reduce diagnostic turnaround time, allowing for timely treatment initiation and reduced risk of severe complications. We have additionally built a malaria dashboard to visualize the data.
Data
Private population analytics
Digital ID systems often expose analytics to third-party agencies for tasks such as resource allocation. Such analytics can raise privacy concerns, particularly if third parties are allowed to submit arbitrary queries. Differential privacy (DP) is one technique for protecting the release of hierarchical, tabular population data, such as ID registry data. A common approach for implementing DP in this setting is to release noisy responses to a predefined set of queries. Such methods cannot answer queries for which they were not optimized. An appealing alternative is to generate DP synthetic data, which is drawn from some generating distribution. This study conducts a head-to-head comparison between the Top-Down algorithm and private synthetic data generation to determine how accuracy is affected by query complexity, in-distribution vs. out-of-distribution queries, and privacy guarantees. Our results show that for in-distribution queries, the Top-Down algorithm achieves significantly better privacy-fidelity tradeoffs than any of the synthetic data methods we evaluated; for instance, in our experiments, Top-Down achieved at least 20× lower error on counting queries than the leading synthetic data method at the same privacy budget.
Maddi, A., Routray, S., Goldberg, A., & Fanti, G. (2024). Benchmarking Private Population Data Release Mechanisms: Synthetic Data vs. TopDown. PPAI Workshop
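The "noisy responses to a predefined set of queries" approach mentioned above can be illustrated with the basic Laplace mechanism applied to a single counting query. This is a minimal sketch and is neither the Top-Down algorithm nor a synthetic data generator. The function names are assumptions, and a production system would need a sampler that is hardened against floating-point attacks.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(records, predicate, epsilon, rng=None):
    """Release a count with epsilon-differential privacy.

    Adding or removing one record changes a count by at most 1
    (sensitivity 1), so the Laplace noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)
```

Smaller values of `epsilon` give stronger privacy but noisier answers, which is the privacy-fidelity tradeoff that the comparison above measures.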
An Africa-centric data portal with automated data labeling and captioning
As African economies evolve to knowledge-based and data-driven economies, we anticipate the need for an open, easily searchable platform to organize data describing African people, landmarks, and phenomena. This data can be shared and accessed by governments and researchers. Topics include (but are not limited to) mobility, climate, public health, and finance. We have built such an open data portal that is free to access, currently featuring over 3,000 links to African datasets. We are additionally developing the ability to automatically tag and caption datasets using natural language processing techniques. For more information, or to contribute a dataset, see the link below.
To enhance user experience with our open data portal, we have undertaken two follow-up initiatives: Automated Dataset Description and Tagging, and a Data Quality Assessment Framework.
The primary objective of the automated dataset description and tagging is to enhance data search, discoverability, reusability, and interpretation, fostering an enriched and user-friendly data ecosystem. Our proposed solution involves integrating an automated system, leveraging machine learning models, into the portal. This system aims to generate meaningful dataset descriptions and recommend relevant tags during the data upload process.
To ensure the quality of the data in our portal and contribute to universal data quality standards and practices, we are developing a data quality assessment framework. The proposed framework aims to establish standardized dimensions, metrics, and processes for measuring and validating data quality across various sectors. It will be implemented as a tool that scores datasets and provides insights for enhancing their quality.
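As one example of what a dataset score could look like, the sketch below computes completeness, a commonly used data quality dimension, as the fraction of non-missing cells per column. The text above does not specify which dimensions the framework uses, so the choice of metric, the function name `completeness`, and the definition of "missing" are all assumptions.

```python
MISSING_VALUES = (None, "")  # assumption: what counts as a missing cell


def completeness(rows, columns):
    """Return the fraction of non-missing values per column, each in [0, 1].

    `rows` is a list of dicts; a column that is absent from a row counts as missing.
    """
    if not rows:
        raise ValueError("cannot score an empty dataset")
    scores = {}
    for column in columns:
        present = sum(1 for row in rows if row.get(column) not in MISSING_VALUES)
        scores[column] = present / len(rows)
    return scores
```

Per-column scores like these could be aggregated into a single dataset score and shown alongside other dimensions such as consistency or timeliness.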
Open Data Portal: A repository and portal for African datasets
Health data portability
In most countries, patient medical records are confined in siloed systems owned by hospitals, clinics, and health centres, resulting in a fragmented view of patient health histories. Patients cannot access their health records at their convenience, which limits their ability to take ownership of their records and to help healthcare professionals in other countries make decisions. This project aims to decouple medical information from a particular vendor or provider and to facilitate cross-border treatment by giving patients ownership of and access to their medical records. We are currently building a proof-of-concept showcasing how data from various healthcare providers can not only be exchanged, but also viewed and updated while maintaining a longitudinal view of a patient’s health records.
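A longitudinal view can be thought of as one date-ordered timeline assembled from every provider's records for a patient. The sketch below is a hypothetical illustration only. It assumes that providers share a common patient identifier and a simple record shape, neither of which is stated above, and it ignores consent, access control, and health data standards.

```python
from datetime import date


def longitudinal_view(records_by_provider, patient_id):
    """Merge one patient's records from all providers, oldest first.

    `records_by_provider` maps a provider name to a list of dicts with
    at least "patient_id" and "date" keys (an assumed record shape).
    """
    timeline = []
    for provider, records in records_by_provider.items():
        for record in records:
            if record["patient_id"] == patient_id:
                # Keep track of where each entry came from.
                timeline.append({**record, "provider": provider})
    return sorted(timeline, key=lambda record: record["date"])
```

For instance, a vaccination recorded at one hospital on `date(2022, 1, 15)` would appear before a checkup recorded at a different clinic on `date(2023, 5, 1)`.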
Data reconciliation (with the eGovernments Foundation)
The eGovernments Foundation, in collaboration with the Upanzi Network, is developing a module for integration into the DIGIT platform. This module is designed to address data persistence issues that may arise between the business module and the indexer service, ensuring seamless and efficient data management. A group of CyLab-Africa engineers is working with DIGIT engineers on a microservice that can reconcile the data as it flows from Kafka to the business module and indexer service.
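At its simplest, reconciliation means checking that every record seen upstream also arrived in each downstream store. The sketch below illustrates this with plain sets of record identifiers. It is a hypothetical simplification that does not use any DIGIT or Kafka API, and the function name `reconcile` is an assumption.

```python
def reconcile(source_ids, business_ids, indexer_ids):
    """Report which source record IDs are missing from each downstream store."""
    source = set(source_ids)
    return {
        "missing_in_business": sorted(source - set(business_ids)),
        "missing_in_indexer": sorted(source - set(indexer_ids)),
    }
```

A real microservice would also need to handle records that are still in flight, for example by only reconciling records older than some delay, and would replay or flag the missing ones.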
Connectivity
Rural connectivity for resource-constrained settings
There is still a huge digital divide between people living in rural areas and those living in urban areas. Due to reduced technological infrastructure, underserved communities may be excluded from various services offered by the government and other institutions. This project aims to deploy an opportunistic connectivity network that could provide communication opportunities to areas that are underserved by operators or that have no existing communication infrastructure. We aim to deliver this at low cost and with little or no downtime, limiting the need for frequent interventions by service providers.
Measuring internet resilience in Africa
The internet has become an integral part of our lives and livelihoods. Ensuring that all of its components work together with minimal disruption is critical. One of these components is the domain name system (DNS), whose resilience has been threatened by increasing internet attacks. The goal of this project is to quantify the resilience of DNS to attacks in Africa and to provide solutions to strengthen its infrastructure. In achieving this goal, we will also examine the hosting and reliability of African country code top-level domains (ccTLDs) and of all global DNS services with a presence in Africa.
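One simple resilience indicator is how widely a zone's authoritative nameservers are spread across networks, since a zone hosted in a single network has a single point of failure. The sketch below computes this from pre-collected data. The metric, the input format, and the function name `hosting_diversity` are assumptions for illustration and are not the project's methodology.

```python
def hosting_diversity(nameservers):
    """Summarize nameserver diversity for one zone.

    `nameservers` is a list of (hostname, asn) pairs, where `asn` is the
    autonomous system number of the network hosting that nameserver.
    """
    hosts = {host for host, _ in nameservers}
    networks = {asn for _, asn in nameservers}
    return {
        "num_nameservers": len(hosts),
        "num_networks": len(networks),
        "single_network": len(networks) <= 1,
    }
```

Applied to every ccTLD, an indicator like this could highlight zones whose nameservers all sit in one network and would therefore be affected together by an outage or attack on that network.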
Technology and society
Towards responsible innovation for digital public goods: an assessment framework
Responsible Research and Innovation (RRI) has received significant attention in the European Commission (EC). It emphasizes social interaction across the whole innovation process to promote inclusive and sustainable research and innovation. Despite the increasing interest in the Global North, its adoption in sub-Saharan Africa has been slow. This slow uptake can be attributed to existing frameworks that are not informed by the realities and context of sub-Saharan Africa. The project’s main goal is to use RRI as a tool and assessment framework for determining how well research and innovation projects incorporate RRI principles such as ethics, security and privacy, sustainability, and societal benefit. It is predicated on the idea, inspired by the privacy-by-design approach in cybersecurity, that projects should follow a responsible innovation by design (RID) approach from the outset. Therefore, to support researchers and innovators in assessing their work, this project adopts a dual approach: a) developing a new framework of RRI principles for assessing research and innovation projects that is informed by the realities and unique context of sub-Saharan Africa, and b) creating a web-based tool that implements the framework and can be used to assess project compliance. It will initially focus on projects carried out within CyLab-Africa.
Understanding user practices and preferences in the East African mobile money ecosystem
With the unprecedented rise in adoption and centrality of digital financial systems as a driver for financial inclusion in Africa, it is critical to understand users’ experiences and evaluate opportunities and challenges with such systems. To this end, we have been investigating the challenges and opportunities related to the use of mobile money in Africa. Specifically, we are interested in the reasons for the prevalence of third-party SIM cards (SIM cards registered under another person's name), and in user-agent interactions (between users and mobile money agents) and their implications, including for privacy and security. We conducted user studies in Kenya, Tanzania, and Rwanda. In addition to disseminating the findings of these studies in appropriate venues, we intend to engage relevant stakeholders to share insights and recommendations for improving the mobile money ecosystem. We are currently conducting a second phase of the study to address some of the challenges that the first set of user studies surfaced. In this second phase, we will address users' privacy and security concerns as they relate to mobile money transactions. The outcomes and recommendations will provide valuable insights into protecting user data and privacy in mobile money transactions.
Sowon, K., Luhanga, E., Cranor, L.F., Fanti, G., Tucker, C., & Gueye, A. (2024). The Role of User-Agent Interactions on Mobile Money Practices in Kenya and Tanzania. IEEE S&P 2024
Luhanga, E., Sowon, K., Cranor, L. F., Fanti, G., Tucker, C., & Gueye, A. (2023). User Experiences with Third-Party SIM Cards and ID Registration in Kenya and Tanzania.