Monday, August 17, 2020

Introduction, Part 3: The Technology of One Acre Fund

The Technology of One Acre Fund reflects the philosophy and business practices of the organization, which shouldn't be much of a surprise. Unfortunately, business practices are relatively easy to change and adapt to a new context; Software is much harder to alter and tends to be built in accretion layers. Personally, I believe that the offline system powering One Acre Fund's field operations (Roster) is a key component of the organization's scaling - to really understand it, a story makes more sense than a technical manual.

The Story of Roster

Roster wasn't planned so much as it grew organically. The core business it supports varies widely by geographic area (every OAF country program runs as a startup with near-total control over business practices), but in general consists of:

  1. An intense, short (3-4 weeks) enrollment period to sign up for a season and order products (intent to buy agreements, not legally-enforceable contracts)
  2. A required prepayment amount/percentage by a certain deadline (usually paid in multiple payments over time as clients have cash, not weekly/monthly terms).
  3. A bulk distribution at planting time, utilizing paper management tools, with clients allowed to change their orders on the day of delivery
  4. A bulk distribution at top dress time (second batch of fertilizer used when plants are almost fully grown)
  5. A period to enroll in Top-Up/AddOn orders (additional products)
  6. Multiple additional bulk distributions for Top-Ups/AddOns or Incentives
  7. Flexible Repayment throughout the season (not relying on weekly/monthly payment terms) until the Final Repayment deadline
  8. Analysis of Repayment progress with soft incentives (T-shirts, farm tools, etc.) to encourage a healthy repayment track

Field Tools

For most of its history Roster was basically a way to generate and data-enter those 4 paper forms; over time it's slowly grown to do more things but those are still the core functionality.

  1. Enrollment form (farmers sign up for a "contract" that's not actually a contract, more like an "intent to buy")
  2. Input Distribution Sheet (which client gets which item off a truck)
  3. Truck Management Sheet (which items go onto which truck)
  4. Client Repayment Progress Report (how much of their loan a client has paid back)

Offline Source-of-Truth


The main difference between Roster and almost any other modern software project is that Roster is explicitly designed to work offline; not "partially online (until you logout and it stops working)", it's meant to operate offline for weeks without an internet connection but still sync data when needed. Other tools and apps offer Offline options, but keep the source-of-truth status as the cloud system; Roster's source-of-truth is the Bookkeeper laptop.

Early One Acre Fund (2006-2010)

The earliest prototypes of One Acre Fund software were created by non-technical staff members using spreadsheets and Microsoft Access to provide inputs to dozens of farmers in relatively isolated communities. During the early days of the company, cell networks in East Africa were practically non-existent; if you visit any one of the current company residential compounds, you will almost certainly find a "Programming in Access" textbook mixed in with the various agricultural and leadership reference material. For several years, successive generations of technical people would join the organization, throw out the existing Access work and start over; the business model was changing so quickly that it made more sense to throw out the previous system and start over than to add features. Unfortunately, this approach is difficult to scale; throwing out all previous work is a controversial topic in the tech world.

Early Scaleability (2010 - 2013)

Starting in 2011, developers decided to continue working on the same Access software rather than throw out the existing work. A small team of developers (with some turnover, but never a team larger than 5 people) started iterating heavily on the Access databases built by non-technical field staff. For a period of about 5 years, the development team had the reputation of being able to quickly and easily add new "hack features" to the software to meet the needs of various trials/business processes. This development team made use of a number of tricks and strategies to "supercharge" Access far beyond the intended uses of the system - I'll provide more technical detail in a separate post, but key among these tricks:

  • Backing MSSQL databases with merge replication - merge replication is a powerful-but-complex offline tool that makes it possible to run separate databases on offline laptops, with only occasional syncing to a cloud-hosted server. Every laptop maintains a separate, isolated copy of just its own data; the cloud system keeps the complete/merged data set from all combined systems. The jargon can be a bit confusing, so it helps to remember:

Replication = Data Copying from place to place

Merge = Separate Subsets that later get combined together

  • Obfuscation and security within Access - encrypted database connection strings and security modifications that prevented unauthorized users from viewing the Access code.
  • Offline data centers - using local networks, multiple laptops could be connected as "assistant" machines to one "Bookkeeper" machine. This makes it possible for One Acre Fund to run medium-scale (10s of laptops) data entry centers in areas that don't have internet connections, then physically move the Bookkeeper machine to an area with a connection to sync it every few days.
  • Heavy reliance on stored procedures and user-defined functions - while SQL code is not as sophisticated as node.js or C#, it provides more functionality than Access VBA script. 
  • Sophisticated SQL methods - complex dynamic SQL was used rather than VBA, including:
    • Dynamic SQL statements stored inside a SQL table, distributed by replication and run on application startup to create local keys/indexes
    • Heavy investments in Indexed Views, Query Tuning and Offline Laptop Performance (a problem as many of the laptops were out-of-date and have poor hardware specifications)
    • SQL queries that dynamically write other SQL queries (Indexed Views) and SQL with embedded email notifications - I plan to write a post on these "SQL beyond what you think SQL can do" methods


However, this decision to keep the existing Access code led to a few drawbacks:
  1. Schema design - the database schema was meant for easy printing of field material, not for performance, storage or limitation of redundant information. A whole set of practices (database normalization) exists to help with designing the best possible schemas for performance and storage; these techniques were not used for the Core Business tables of One Acre Fund.
  2. Audit Trail Weaknesses - running code on a local laptop makes it possible for users to alter the time and date settings, bypassing date checks in the code. Strong authentication of users proved quite difficult to enforce at the field level, leading to shared accounts and authentication done by control of the physical laptop.
  3. Tight coupling of infrastructure and business logic - the original Roster system required physical laptops in each geographic location (district), with the idea that each district would run independently and own its own data. The idea of separate districts is used as the "sharding key" - nearly every database table and software function requires a district identifier. This is causing problems now as countries look at "districtless" logic like brick and mortar stores.

At the time, Cell Networks in East Africa were still quite unreliable; early software updates sometimes required sending a person with a flash drive on a multi-hour bus ride across rural Kenya. The Core Program and business model of One Acre Fund were being rapidly iterated in multiple countries, leading to situations where Off-the-Shelf software would have been too hard to adapt in time to meet the business needs.


Early Cloud Systems (2012-2014)


Expanding cell networks across East Africa made it possible to have centrally hosted servers holding the "merged" main database (instead of laptops in East Africa, use a fully-functional website hosted on a cloud provider called Hetzner). An operations website is built using C#/ASP.NET to handle the aggregated data from all the separate data entry laptops. The operations website is much easier to write code for, deploy and manage. However, it doesn't run offline - so core OAF functionalities are still run on the Bookkeeper laptops.

The early operations website was envisioned as a data warehouse to allow analysis by global/high-level country staff, while day-to-day operations would be handled by individual district machines. Some of the first reports were essentially just database dump operations (like the SeasonClients export, now the most commonly used export with 50,000 downloads in 2019).

Later the ability to add bundle configurations, create inputs and perform other high level configurations were added. The original idea was still to only do central configurations and reporting for districts, but all client-level data entry and data manipulation would be under the control of the district.

A few additional tools and business processes were added to this website between 2015-2017:

  • Transactional MSSQL Replication - this technology makes it possible to maintain realtime read-only copies of the Roster database on multiple cloud servers. The jargon for this feature really means:
Replication = Data Copying from place to place

Transactions = Database Events that all happen together 

  • Bulk Uploaders - Uploaders to the operations website make it possible to do bulk data changes directly to a cloud server, rather than separately on a dozen laptops. This represents a major shift away from the District as the owner and source of all data. More and more business functions start being performed centrally and uploaders allow taking advantage of mobile money rather than collecting money in the field. Uploads are optimized for efficiency (pretty common for 100,000+ line uploads to happen multiple times in a day) .Different countries have different network capability, so highly complex logic around what could happen online and offline was built into the ever-expanding Roster system.
  • Operations website additional tools (2015-present): More and more tools are created in the operations website, such as Government location entry, client search functions, specific distribution planning, etc. Tools that have overlap between the operations website and Access frontend are written as SQL stored procedures to avoid duplicate work. 
  • Solar Home Systems (2015-2017): Solar Home System paygo codes (using Angaza and Biolite, two SHS providers) are attached to the operations website. This makes it possible for clients to order solar home systems and pay the balance over time, with a monthly code entered into the device to keep it functional. If no code is entered, the SHS stops working. Included in this logic was the idea of product warranties (even warranties on parts of products) to handle malfunctioning solar lights.
  • Insurance (2015-2018): Insurance payouts were added to the database as part of the OAF model - especially funeral insurance and crop insurance. 
  • Brick-and-Mortar stores + B2B sales (2015-present): Brick-and-Mortar sales are attempted in multiple countries in order to reach new market segments. Some trials have more success than others, with Kenya's Duka project (OAF-managed brick-and-mortar shops) and Rwanda's P-Shop project (OAF partnerships with existing agrodealers) showing promise.

Separation of Finance from Field (2015-Present)


The philosophical balance that I described previously manifests in our core systems - eventually, the need for standard off-the-shelf Western business software rather than homegrown, customized accounting became obvious. While this is a gross oversimplification, the Western-esque parts of One Acre Fund (Finance, Logistics, Global Purchasing) make use of SAP for off-the-shelf capability of running a global-scale business. The uniquely East African parts of the company that require extreme customization continue to be done using the custom Roster software, with aggregated data from both sets of software combined in a data warehouse. This leads to some bizarrely different levels across different parts of the organization - finance software (SAP and custom-built frontends hosted in Azure) are used almost exclusively by staff based in capital cities with strong internet connections while Access Bookkeeper Laptops are still necessary for use in field locations.

Expanding Cell Networks (2015-2020)


Expanding Cell Networks in East Africa make a number of innovations possible:

  • Mobile Money (2015-present): Mobile money is trialed + added in Kenya - this makes it possible for farmers to repay their balance through SMS/USSD menus. Direct safaricom integration as well as two aggregators (Lipisha and Beyonic) start to be used. This provides a much clearer audit trail for incoming payments and much faster payment processing; it is currently expanded to almost all OAF countries.
  • Mobile Enrollment (2016-present): Bookkeeper laptops are designed to facilitate paper contract data entry; at the level of hundreds of thousands of clients, this starts to cause problems (too many paper contracts to enter, even with offline data center tricks). Tablet enrollment is trialed at a scale of ~250 tablets between 2016 and 2018; in 2019 Kenya goes full scale (3000 tablets, one for each FO). Other countries are currently adopting this software; it was originally written as a progressive web app but is now being adapted into a native application.
  • USSD/SMS Menus (2018-present): Telerivet (a third-party SMS/USSD tool) is used to allow clients to check their balance directly + run trials. Later this functionality is expanded into Mobile Money (a cost-savings measure uses Telerivet + Beyonic in combination). During the 2020 COVID pandemic, rapid reaction through USSD/SMS has allowed the organization to enroll more than 700,000 farmers through a USSD menu. 
  • Field Support (2018-present): Essentially a digitized Client Repayment Progress Report. Field Support was a developer-driven project to rethink the OAF backend using a new offline-capable backend technology (CouchBase). Implemented in a different platform than other Roster tools (Linux/Docker/CBase/node.js) as a caching layer between Roster and tablet users. 
  • Axe Access (2019-present): This is a much-needed push to replace all access tools with operations website tools. In several locations (especially Kenya) offline functionality isn't required; in just 10 years previously-offline locations now have high-speed internet connections and cell networks, so hard-to-maintain offline tools aren't required. Offline capability makes it difficult to release changes and debug issues, so online tools are simpler to iterate and can make use of cutting-edge tools unavailable offline. 
  • Roster Evolution (2019-present): This is a transformative vision to replace/refactor Roster system into something new and better. While the logistics of transforming a $100 Million piece of software can be daunting, the next generation of mobile-optimized tools provide exciting potential to more effectively serve a larger number of clients.

Where We Are Now:


As I write this (August 2020), One Acre Fund is undergoing a transition period. We still have Access databases in active use, but are phasing this out in favor of tablets and USSD/SMS-based tools. We're actively transforming our server hosting processes and transitioning from manually-configured Windows servers to Containerized Linux Microservices. While the Coronavirus pandemic has been an international tragedy, it has also forced the organization to adapt far faster than we thought possible, spurring new innovations in One Acre Fund's technology. 

My Part in this Story:


I started at One Acre Fund in January of 2017; at the time we were doing regular Access deployments, expanding Mobile Money, doing early trials of Mobile Enrollment and starting to look at USSD/SMS. My role was formally "Development Operations Engineer", now "Database Administrator" - informally I see my role as "make everyone else's job easier". I basically do support engineering - I don't provide value directly to the business, I provide value to the other software developers so that they can work faster and the team can accomplish more. This is a difficult role for a number of reasons - I need to understand what everyone else is doing in enough detail to fit all the pieces together, fix them when they break, provide production support to deploy them into the real world and help identify/fix scaling issues that occur when you have millions of clients. Throughout my time at One Acre Fund, I've taken on a number of support-type projects:
  1. Reinitialization Cleanup (2018): Reinitialization (the process of refreshing the full database on all offline Bookkeeper laptops) is a cumbersome process involving cross-country coordination; every BK laptop must have an internet connection at the same time. Cleanup of reinitialization code to run automated scripts rather than require manual action used in 2018; this was successful. This was a massive exercise in coordination in 2020 involving 20 cloud servers, 139 offline laptops, 10 countries and two production support personnel (myself and our sysadmin).
  2. Offline Deployments - deploying code updates to an offline system is not an easy task.
  3. Tablet Configuration - I basically ran a factory in Kenya for two weeks to configure 3000 mobile devices
  4. Production-Grade Security (SSL certificates, database obfuscation, VPNs and forensic analysis)
  5. AlwaysOn Clustering - providing immediate redundancy if our servers go offline
  6. Test Servers/Ephemeral Servers - giving other developers a complete, isolated test environment
  7. Bookkeeper Offline Tools - maintenance of the custom frontends that One Acre Fund has built to allow Merge Replicated-laptops to run throughout East Africa

Finally, through a combination of staff turnover and developer specialization, I'm now left as the only  developer with production credentials at One Acre Fund. While other developers understand subsets of the software in more detail than I do (especially specifics of the business logic), I'm the only person at One Acre Fund who is able to understand and make changes to the entire live structure (Access, MSSQL, C#, Angular, node.js, docker, couchbase and soon kubernetes + react). That's a situation I'm actively trying to change by training other developers and writing documents like this one. Along the way, I've acquired strong opinions about the things that One Acre Fund's technology has done well and things that we could definitely stand to improve; this blog is meant to capture both my specific technical experiences and the philosophy that I've acquired during my time here.


No comments:

Post a Comment

Cowboys vs. Soldiers

I classify developers into two main camps: Cowboys (aka Hackers) and Soldiers (aka Tweakers) - cowboys are great for startup environments wh...