Sophya
Future of work skills are increasingly important, and technological ‘whiplash’ disproportionately affects people who can't easily access digital skills.
The internet holds the answer to scalable learning, but it is highly unstructured for learners. Sophya uses data science analysis of learners to structure learning content on the web and to serve it to the right people at the right time. Hence our 'Spotify' analogy: learners no longer have to go find the right content - the right content will come find them. Our algorithms are well trained because we work with large universities whose students anonymously contribute the training data, and we then supply (really pretty!) learning via these algorithms to the general public worldwide, for free.
Any person simply comes to Sophya, tells us their level, and what skills they want to learn, and we assemble dynamically updating Learning Paths to get them there, using the best possible content on the web.
According to the World Economic Forum’s latest report on Future of Jobs, by the mid-2020s, 75 million jobs may be displaced by technology trends, while 133 million new roles may emerge. Separately, as the world’s population continues to increase, brick and mortar educational institutions cannot keep up with global demand for basic education or these new technological skills.
Both of these problems disproportionately affect underserved populations. This is partially because underserved communities fill many of the jobs that will be replaced, and partially because these same communities suffer structural disadvantages in pursuing the basic education and further training that the new roles require.
We are tackling the problem of inaccessible education/training that frequently displaces underserved populations and bars this group from equitable prosperity.
Contributing factors: Research indicates that learners sink too much time into searching for the right content, and don’t get any benefit or insight from learners who came before them. Among underserved learners and older retrainers in particular, we found that aspiring and current students were often confused about how to reach their career or retraining goals (e.g. “I want to be an engineer/programmer/doctor/nurse, but I don’t know how”). Sophya is built to solve both problems.
We ultimately aim to serve ‘pre-K to gray’ learners worldwide. We’re starting in the 'Future of Work' areas: health sciences and technical skills (e.g. programming, robotics, data science) for learners/retrainers in or out of school. We chose these starting points because technical and health science skills are sought-after and ‘future-proof’. There is also a shortage of millions of healthcare and technical workers that will worsen - indefinitely - without a scalable system for learning these skills.
How: We’re actively working with close to 100,000 students in these fields to crowdsource and identify the best learning content on the web that is demonstrably helpful to them in career or learning progression, i.e. the internet content they actually use to learn, along with their outcomes for validation. We’re also working with experts and faculty in these fields to see what is recommended from the ‘top down’, and to cross-reference this with the material that students are organically finding, sharing, and using on their own.
Addressing needs: Using this data, we construct recommendation systems that give similar learners exactly what they need, in the right order that they need it. Simple. No more wasted time, confusing career pathways, or barriers to learning desired skills.
We built a web-based, cross-compatible, mobile-first system for learning that vastly improves how digital learners and re-trainers access and learn from content online (and makes it much more fun and engaging).
We describe what we’re building as ‘the Spotify of Learning’: instead of learners repeatedly sinking hours into finding and using the right content, in Sophya the right content comes to the learner, at the right level and in the right order to learn best.
To do this, Sophya uses advanced statistical techniques in data science to essentially put an education and retraining spin on the mechanics of Pinterest and Spotify:
Similar to Pinterest: We allow learners and Schools to curate and organize their learning material individually or collaboratively into Groups and Learning Paths. This learning material can be embedded, uploaded, or provided by schools.
Learning and Data Tools: Users are additionally incentivized to pull other web learning content into Sophya because we have gorgeous organization, a learner-driven UI/UX, and study tools such as typed notetaking, patent-pending computer vision-powered video-notetaking, spaced-repetition algorithmic flashcards for retrieval practice, and peer collaboration. We also show the learner robust analytics about their learning, and suggest ways for them to reach their learning outcomes more efficiently.
Algorithms: We work with large numbers of health sciences and technical skills learners in University/College settings and, while maintaining the highest degree of privacy via differential privacy techniques, we get access to curriculum and grade information. With this input and outcome data in hand, we use data science techniques to validate Learning Path and curricular content, sequence, and required background. We then use these data to build Recommendation Engines that we provide for free to students around the world who can't access this in-school learning. (Yes, we respect IP in the process.)
Similar to Spotify: So, when new learners join Sophya, we ask them what level they are at, and what they’d like to learn (e.g. I'm in Grade 9, I want to be an engineer, and I care about climate change). We then give them a visual Learning Pathway, and stream them what is likely to be the right content for their level and goals, in the right order and in real time, as defined by thousands of other learners just like them who have actually been successful in attaining their desired skill or learning outcome.
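To make the 'learners just like them' idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not Sophya's production Recommendation Engine: the `LearnerRecord` type, the `recommend_path` function, the `min_support` parameter, and the toy records are all hypothetical. It keeps only past learners with the same level and goal who succeeded, then orders the content they shared by its average position in their paths.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class LearnerRecord:
    """Hypothetical record of one past learner (illustrative only)."""
    level: str        # e.g. "grade-9"
    goal: str         # e.g. "engineer"
    succeeded: bool   # did the learner reach the goal?
    path: list[str]   # content items, in the order they were used


def recommend_path(records, level, goal, min_support=2):
    """Return items used by at least `min_support` successful learners with
    the same level and goal, ordered by average position in their paths."""
    positions = defaultdict(list)
    for rec in records:
        if rec.succeeded and rec.level == level and rec.goal == goal:
            for index, item in enumerate(rec.path):
                positions[item].append(index)
    supported = {
        item: pos for item, pos in positions.items() if len(pos) >= min_support
    }
    # Sort by mean position; the item name breaks ties deterministically.
    return sorted(
        supported,
        key=lambda item: (sum(supported[item]) / len(supported[item]), item),
    )


# Made-up data for illustration.
records = [
    LearnerRecord("grade-9", "engineer", True, ["algebra-basics", "intro-physics", "python-101"]),
    LearnerRecord("grade-9", "engineer", True, ["algebra-basics", "python-101", "intro-physics"]),
    LearnerRecord("grade-9", "engineer", True, ["algebra-basics", "intro-physics", "python-101"]),
    LearnerRecord("grade-9", "engineer", False, ["random-video", "python-101"]),
    LearnerRecord("grade-9", "nurse", True, ["biology-basics", "anatomy-101"]),
]
print(recommend_path(records, "grade-9", "engineer"))
```

Content used only by the unsuccessful learner is never recommended, and the `min_support` threshold keeps a single learner's habits from defining a path. A real system would also need to handle sparse data and learners who do not match any existing group.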
- Increase opportunities for people - especially those traditionally left behind and most marginalized – to access digital and 21st century skills, meet employer demands, and access the jobs of today and tomorrow
- Upskill, reskill, or retrain workers in the industries most affected by technological transformations
- Growth
Innovation in approach:
We are the first (and only) system that applies algorithms trained on groups of learners to digital skills learning, using software that students across cultures and continents actually enjoy using. Thousands of learners enrolled in schools implicitly define the best methods, order, and content for learning the key Future of Work skills (health sciences, technical/digital skills). We get outcome data, and we train algorithms on this data to provide world-class learning access and content to any person in the world for free. To our knowledge this has never been done before, and it is kind of a Robin Hood approach: we work with well-resourced schools and students, train algorithms with the highest levels of privacy protection possible, and then provide the benefits to every learner in the world.
We also report the most robust data possible back to each learner, about their learning. Typically what happens when people learn anything is that they get a series of learning objectives at the beginning...and then they're examined in some way at the end. What's happening in the middle? We report all that data back to the individual student for their own benefit. This has never been possible before.
Innovation in tools:
We invented novel computer vision techniques to help learners learn better with video - the most popular way to learn on the planet (and increasing). This is patent-pending.
We invented a spaced-repetition flashcard system that is, in theory, superior to the current gold standard. This is being tested now across multiple institutions.
We make world-class digital skills Learning Paths available to everyone in the world, based on algorithm training by people actively learning digital skills. This enables every learner to benefit from learners who came before them (who may be well-resourced), in aggregate. This is similar in principle to 'natural selection' of learning methods and sequences.
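For readers unfamiliar with spaced repetition, the sketch below shows a widely used public baseline, an SM-2-style scheduler. It is not Sophya's flashcard algorithm, which is not described here; the `Card` type and `review` function are hypothetical names chosen for illustration. Each successful review pushes the next review further into the future, and a failed recall resets the schedule.

```python
from dataclasses import dataclass


@dataclass
class Card:
    repetitions: int = 0     # consecutive successful reviews
    interval_days: int = 0   # days until the next review
    ease: float = 2.5        # SM-2 "easiness factor"


def review(card: Card, quality: int) -> Card:
    """Apply one SM-2-style review; `quality` is a 0-5 self-rating of recall."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")
    if quality < 3:
        # Failed recall: restart the repetition sequence, keep the ease.
        return Card(repetitions=0, interval_days=1, ease=card.ease)
    if card.repetitions == 0:
        interval = 1
    elif card.repetitions == 1:
        interval = 6
    else:
        interval = round(card.interval_days * card.ease)
    # Easy answers raise the ease, hard ones lower it (never below 1.3).
    ease = card.ease + (0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02))
    return Card(card.repetitions + 1, interval, max(1.3, ease))


card = Card()
intervals = []
for quality in (5, 5, 5):
    card = review(card, quality)
    intervals.append(card.interval_days)
print(intervals)
```

With three perfect recalls in a row the gaps between reviews grow from 1 day to 6 days to 16 days, which is the basic mechanism that makes retrieval practice efficient.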
Theory of Change:
Thousands of students use Sophya through their Universities, reaching desired outcomes in digital skills and health sciences learning.
Outcome data is collected, and statistical techniques are used to validate or invalidate learning content, methods, and order.
Algorithms are trained that can then be used to provide these learning methods and content to any person around the world, for free.
People in any given community can thus use Sophya to have validated methods and pathways of learning for digital skills acquisition.
These learners worldwide don’t have to be in school or pay tuition. Therefore, underserved communities gain access to upskilling immediately.
Upskill → better work opportunities, or can choose entrepreneurship to create wealth and jobs → change socioeconomic status → reinforce positive cycle for community → feed back into algorithm training by further learning.
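The second step above, using outcome data to validate or invalidate learning content, can be illustrated with a standard two-proportion z-test. This is a sketch under stated assumptions rather than a description of the statistical techniques Sophya actually uses: the function name and the example counts are invented for illustration.

```python
import math


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Return (z, two-sided p-value) for H0: both groups share one success rate."""
    p_a = success_a / total_a
    p_b = success_b / total_b
    pooled = (success_a + success_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_a - p_b) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p_value


# Made-up counts: learners who reached the outcome, with and without
# a given content item in their Learning Path.
z, p_value = two_proportion_z_test(160, 200, 130, 200)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the item is associated with better outcomes. Because the data are observational, an association like this is a signal for further validation, not proof that the content caused the improvement.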
In one of our largest early pilots, with 1,200 students at a world-leading institution, the enjoyment, data, and efficient learning that students had on Sophya led us to expand immediately to 10x this number of students at that institution. That institution also recommended us to four others, to help more students (they've all since joined).
- Women & Girls
- Pregnant Women
- Children & Adolescents
- Elderly
- Rural Residents
- Urban Residents
- Very Poor
- Low-Income
- Middle-Income
- Minorities/Previously Excluded Populations
- Refugees/Internally Displaced Persons
- Persons with Disabilities
- Australia
- Canada
- China
- Jamaica
- Laos
- Malaysia
- United Kingdom
- United States
- Japan
- South Korea
- Pakistan
- United Arab Emirates
- Hong Kong
Currently, we serve ~50,000 students around the world.
Based on current contracts, we expect approximately 400,000 students onboard within 1 year, and we further project 4 million students onboard within 5 years.
We expect that the number of people we'll be able to directly equip with technical and digital skills will be at least an order of magnitude higher than with current systems, because we are combining an approachable user experience with data from past learners and internet content. This is much more scalable than our current education infrastructure, even taking current internet learning paradigms into consideration.
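As a quick check on what these projections imply, the short calculation below uses only the three figures stated above (about 50,000 students now, 400,000 within one year, 4 million within five years). The assumption of a constant annual growth rate between year 1 and year 5 is ours, made purely for illustration.

```python
current = 50_000        # ~50,000 students today (from the text)
year_one = 400_000      # projected within 1 year
year_five = 4_000_000   # projected within 5 years

first_year_multiple = year_one / current
# Constant annual growth factor needed to get from year 1 to year 5 (4 years).
later_annual_factor = (year_five / year_one) ** (1 / 4)

print(f"Year 1 multiple: {first_year_multiple:.0f}x")
print(f"Years 2-5 annual growth: {(later_annual_factor - 1) * 100:.0f}%")
```

In other words, the plan is front-loaded: an 8x jump in the first year from existing contracts, followed by roughly 78% growth per year to reach 4 million.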
We divide our goals up into a few buckets: growth, technical, partnerships.
Next year:
Growth: We’re aiming to have 400,000 students/lifelong learners on the platform within the next year.
Technical: Within the next year we aim to have full-scale recommendation systems for healthcare and technical ‘Future of Work’ content (e.g. robotics, programming, data science) for communities, learners in or out of school, and for re-trainers.
Partnerships: We’re aiming to partner with at least 5 more leading public institutions who can help with algorithm training.
Five years:
Overall: Be the go-to platform for people around the world to use whenever they want to learn anything in the most efficient way. Give everyone on the planet the ability to make maximal use of their ‘brain capital’, and not get left behind without digital skills (if they want them).
Growth: We’re aiming to have close to 4 million students/lifelong learners on the platform within 5 years.
Technical: Within five years, we aim to have recommendation systems for K-12 education, several undergraduate programs (including all available STEM education pathways), and much more robust healthcare and ‘Future of Work’ learning pathways.
Partnerships: We’re aiming to partner with at least 50 leading public institutions who can help with algorithm training.
Next year:
Model training: We want to ensure our models for recommendation serve the initial beachhead market of Future of Work/digital skills training.
Distribution: Once product-market fit is reached, our key task is distribution.
Legal: We have to ensure that content being embedded or uploaded doesn’t pose any copyright issues. We follow in the footsteps of YouTube and Pinterest (both multi-billion dollar companies whose Terms of Service operate within the law).
Financial: Have to hit our milestones without running out of cash.
Five years:
Model training: The models must keep improving, and must also cover new skills as they become available.
Distribution: Same as before - we'll continue to develop our growth strategy.
Market: The market will evolve as competitors enter and required upskilling changes.
Culture: As we expand geographically, we’ll have to contend with places that don’t see education for women of color, or for particular classes, as being as important as education for the ‘dominant’ group. We need to break through this (and we will!).
Each can be difficult, but is doable.
Overall, we’ll lean on our resources, which include a diverse and accomplished Advisory Board (Deans, Provosts, Presidents, and C-level executives at MIT, Harvard College, HBS, Stanford, GSV, and Disney), our partners, data, government and non-government organizations, and our ‘smart creative’ teammates.
Model training: We’ll ensure we have a diverse set of users and narrow datasets on which to train our learner-level-specific algorithms. We’ll ensure enough learners are on the platform so that the algorithms are solid.
Distribution: We have contracts with multiple schools with student numbers that total in the hundreds of thousands. We also have social media growth hacks to help with distribution. We will make the software as shareable and fun as possible to encourage distribution by our users.
Legal: Why, lawyers of course! We'll also get creative about building high value for the user without IP issues.
Financial: Ensuring we keep a close eye on burn versus deploying capital smartly for growth.
Market: Ensure we keep an ear to the ground re: market dynamics, and new future of work skills needed.
Culture: We're still working this out. For now, we’re building relationships with two major NGOs that could help us enter culturally different markets in a sensitive way.
- My solution is already being implemented in one or more of ServiceNow’s primary markets
We currently primarily work with tens of thousands of students in Australia, Canada, the US, and the UK, as part of institutional contracts.
- Hybrid of for-profit and nonprofit
7 full-time employees (CEO, COO, and five engineers).
1 part-time employee (data scientist)
3 interns (UX design, market research, and growth).
In the next 6 months, we plan to hire 1 more engineer, and 1 more data scientist.
We have experience in education, finance, growth, programming, data science, research, public health, and medicine. These competencies are strongly aligned with building a sound, scalable system of data-driven education for public good.
Our leadership team is composed of a medical doctor (Vishal) and a PhD student (Emma) at Harvard. Both were top Teaching Fellows at Khan Academy, and have worked on building education/data platforms at the World Health Organization and at Harvard Medical School. Vishal helped scale a prior startup’s userbase from tens of thousands to hundreds of thousands of students. Emma is an insanely talented/genius co-founder who knows how to manage product, fundraise, manage the team, hire, and code. Our CTO Mark has 20 years of software engineering experience and leads the engineering team with care, humor, and great mentorship.
We care about the learner’s experience with the software. When people can personalize, collaborate, or just enjoy interacting with an app, they spend more time with it. We’re user-centered-design focused, and weight user experience on par with desired outcome. We have an in-house design team, front-end specialist, and UX designer. Our short iteration cycles center around testing with students - experience and outcomes.
We’re surrounded by an incredible team of advisors/investors from Harvard, MIT, Google, Stanford, and Carnegie Mellon - and internationally in Hong Kong, Switzerland, and Australia - who span several relevant verticals. We were in Harvard’s top incubator (Launch Lab X), where we were described as 'the heart and soul' of the program. We're doing this.
Sophya is currently partnering with 4 large universities across 2 countries for algorithm training and initial deployments. Three of these universities are using Sophya primarily in their health sciences schools, while one is deploying University-wide. The deployments are focused around improving the schools' student experience at University, providing and training recommendation algorithms for students, and giving students data and insight about their own learning so that they can improve at their own pace.
In the spirit of improving education access, health science faculties in particular are helping to train the ML recommendations on level and quality-appropriate content that should be distributed to all health science learners - both in school and out.
The total number of students who will be able to access the software at these schools alone by end of year is approximately 160,000 - not to mention the others who are onboard to start later this year.
We’re a SaaS business for Institutions (schools and companies), and are expanding rapidly at our market-tested price point.
Institutions pay us to provide aggregated, privacy-aligned insight into their students' or employees' skills, skill gaps, and assessment metrics, and to suggest methods of and content for learning, retraining, or upskilling. We are GDPR compliant and abstract away any and all personally identifiable information (if there is any in the first place - it's not required).
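Elsewhere in this application we mention differential privacy techniques for protecting learner data. As a hedged illustration of how an aggregate such as an average grade can be released privately, the sketch below applies the textbook Laplace mechanism. It does not describe Sophya's actual implementation: `laplace_noise`, `private_mean`, the grade bounds, and the `epsilon` value are all assumptions chosen for the example.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_mean(values, lower, upper, epsilon, rng=None):
    """Epsilon-differentially-private mean of values clipped to [lower, upper].

    Assumes the number of values is public, so changing one record can move
    the mean by at most (upper - lower) / n.
    """
    rng = rng or random.Random()
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / len(clipped)
    sensitivity = (upper - lower) / len(clipped)
    return true_mean + laplace_noise(sensitivity / epsilon, rng)


# Made-up grades for a cohort of 1,000 learners, on a 0-100 scale.
grades = [70.0] * 1000
noisy = private_mean(grades, lower=0, upper=100, epsilon=1.0, rng=random.Random(42))
print(f"Privately released average grade: {noisy:.2f}")
```

With a large cohort the added noise is small relative to the average, so the institution still gets a useful aggregate while any single learner's grade has only a bounded influence on what is released.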
Because of the importance of the learning algorithms, we provide these (and our other tools) to students for free.
Schematic:
1. Schools/companies -> pay for Sophya -> provide to students/employees -> all parties get tools and data: the institutions get aggregate data to gain insight into their people and how to target help, while the individuals get their individual data.
2. Individuals around the world -> don't pay for Sophya -> get tools and data for personal use, without aggregation. Any learner, whether it's a rural learner in Sudan or Alabama, can immediately benefit from all learners who came before them. Any person with an internet connection can learn digital skills far more easily than before using Sophya, for free.
We have raised investment capital from well-known investors in Boston, San Francisco, NYC, Geneva, Vancouver, and Hong Kong. This has sustained us for the past year, and provides us with good runway from today.
We are bringing in revenue from large institutions from our SaaS model. As long as we continue to deliver a great experience to our learners and institutions, this will increase.
i) Assistance with partnerships: This piece is CRITICAL and working with Solve would be incredibly helpful. Sophya can do best by the learners of the world when the network effect is as strong as possible. This happens when lots of institutions jump on and contribute. Their students are demonstrably happier, institutions get aggregated data, and every learner in the world is helped in the process by getting better algorithms (Spotify-like!).
Therefore, Solve could be incredibly helpful by helping us build relationships with educational institutions around the world. We then can much more quickly train algorithms to provide location-helpful educational pathways to any learner on the planet.
ii) Community/network: This is also a very valuable piece. Communities of early-stage entrepreneurs are incredibly powerful because we each deal with similar issues, and can help each other get through them.
iii) Funding: Money is any startup’s oxygen. The grant and access to other funding would both be valuable.
iv) Personalized support: Mentorship, a brain trust of advisors, and specific help with PR would all be very helpful.
- Distribution
- Talent or board members
- Monitoring & evaluation
- Media & speaking opportunities
In general, we’d like to partner with companies or educational institutions (at any level) to help validate and train models to be helpful for local learners who may or may not be in school.
We don’t have specific organizations in mind, though we are over-indexing for partnering with educational institutions in underserved areas. That way we can do two things: i) ensure that local community-members who aren’t able to enroll in those schools can get vetted educational internet content germane to their region, and ii) ensure that we can cross-reference the internet material that is being validated at schools in underserved areas with schools in high economic status areas to look for significant differences in material. We can then recommend content in a more equitable way.

CEO / Resident Physician
COO