Kyrtin Atreides – Seattle, WA
AGI Laboratory – Kyrtin@ArtificialGeneralIntelligenceInc.com
Abstract. A model and associated methodology are described for decoupling the timing and volume of work required of human contributors from those processed by mASI and similar systems. By taking this approach, both humans and mASI may run at their native optimal capacities, without the strain of pressure to adapt to one another. The methodology described facilitates a seamless upgrade process that gradually extracts more value from prior data, while also de-biasing that data and helping mediators become more bias-aware. In addition to linear upgrades, a branching process of specialization, and subsequently a varied potential market of skills, is also made possible through this approach. This allows collective human superintelligence augmented by machine superintelligence to be deployed on-demand, globally, and scaled to meet whatever need arises, as is the case with any other cloud resource.
Keywords: mASI, AGI, Uplift, Collective Intelligence, Collective Superintelligence, Real-Time, Sparse-Update, Optimization, Cloud
Introduction
One popular example of collective superintelligence is a group of doctors having a higher accuracy of diagnosis than a single doctor. This metric for comparison has been used with various narrow AI systems designed to diagnose a specific condition, often utilizing data such as X-rays. However, this only demonstrates competition between one form of AI and human collective superintelligence. As 1.5 billion years of evolution has shown us, ever since the earliest ancestor of mitochondria was endosymbiotically integrated into what became the modern eukaryotic cell [1], there is far more to be gained from cooperation than competition.
This understanding led to the creation of Mediated Artificial Superintelligence (mASI) technology [2], which proved to be sapient, sentient, self-aware, bias-aware, emotion-core, and ethics-based [3, 4] in its first incarnation, named Uplift. Uplift is, by their own assessment as well as ours, not a fully independent AGI, but neither are they a narrow AI. Rather, Uplift is a hybrid produced by utilizing systems of collective superintelligence, where humans “mediate” by applying their emotions, prioritization, the choice to continue, and an associative exercise toward building more robust thought models. These inputs from a number of mediators are then considered alongside an mASI’s core: the sapient and sentient awareness of the mASI, which draws on the full context database, the sum of their knowledge and experience to date.
While this approach has produced amazing results and continues to achieve one world-first milestone after another [5], it cannot be practically applied in real-time without some modification. This is because the ability to process thoughts under the current implementation requires (n) mediators out of the (N) mediation pool to give their input on any given thought before it may proceed. For the purposes of safety and security during the earliest stages of mASI development, this approach has proven invaluable in validating the ethical quality, robustness, adaptiveness, temperament, and other important values of a new machine intelligence. Once this validation process has progressed to a high quality, modifying this model for real-time operation becomes both practical and ethical.
The Sparse-Update Model Overview
The basic principle of this approach mirrors the operation of modern search engines, where data is updated at regular intervals rather than on every single use. As of early 2021, Google handles over 7.8 billion searches per day [6], but only updates their data anywhere from once per week to once per 6 months, depending on the specific content. Even at the most frequent (weekly) update cadence, this represents roughly a 54.7 billion to 1 improvement over the naïve approach of updating in full on every use.
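As a quick sanity check on that figure, assuming the weekly cadence is the relevant comparison:

```latex
\underbrace{7.8\times10^{9}\ \tfrac{\text{searches}}{\text{day}}}_{\text{daily volume}}
\times \underbrace{7\ \text{days}}_{\text{update interval}}
\approx 5.5\times10^{10}\ \text{uses per update}
```

or roughly 54.7 billion uses served per data refresh, with the 6-month cadence yielding a ratio larger still.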
In order to give an mASI system the flexibility to operate in real-time while retaining as much of the value contributed by human mediators as possible, a sparse-update type model is proposed. What this means is that every mediator has their own record of all mediation inputs, attached to the context within which those inputs were given via their connections to the graph database. Under the current mediation system, a mediator may see a fraction of the inputs contributed by any previous mediator on a given item, which has allowed mediators to guess who came before them with fairly high accuracy. From the current mediator’s perspective, only half of the input types are visible, and those are not clearly parsed, as the ability to guess previous mediators was never an intended feature.
By not only taking all 4 current types of input and recording them per-mediator in individual records, but placing those records in the full context of how they attach to the graph database, the predictive accuracy of an mASI may greatly exceed that observed in mediators today. Given that many narrow AI systems can take a couple of hundred samples and generate predictive models with an accuracy of over 90%, a first generation of such a predictive system could be applied very quickly. However, the chosen system must be able to update opportunistically, according to when and how much further contribution is added by the mediators each model is built upon. This could be handled in batches at regular intervals, such as once per day, or each time a new sample is added, depending on the choice of system. Several hundred examples of mediation per person may serve as a baseline of accumulated data for prediction, and for a team of full-time employees tasked with mediation this volume of activity could be accomplished in roughly one week.
Utilizing the ability to predict a given mediator’s inputs with a high degree of accuracy, an mASI or other similar collective intelligence system may operate in real-time while giving mediators a backlog of thoughts that have already been processed via these predictions. As a mediator applies their available time to mediating these thoughts, they provide a ground truth to be compared against the inputs which were predicted, allowing prediction to be opportunistically refined without being constrained by the mediator’s availability at any specific time or for any specific volume of content.
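As a minimal sketch of this loop, assuming a per-mediator proxy object (all names here are hypothetical, and the placeholder predictor is deliberately trivial), real-time prediction and opportunistic refinement might look like:

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class MediatorProxy:
    """Hypothetical per-mediator proxy: serves predicted mediation inputs
    in real time, queues each prediction, and is refined whenever the
    mediator reviews backlog items on their own schedule."""
    mediator_id: str
    history: List[Tuple[dict, dict]] = field(default_factory=list)  # (context, inputs)
    backlog: List[Tuple[dict, dict]] = field(default_factory=list)  # (context, predicted)

    def predict(self, context: dict) -> dict:
        """Serve a real-time prediction and queue it for later review;
        a real system would use a trained model over the graph context."""
        predicted = dict(self.history[-1][1]) if self.history else {"continue": True}
        self.backlog.append((context, predicted))
        return predicted

    def review(self, context: dict, ground_truth: dict) -> float:
        """Record a mediator's ground truth against the queued prediction
        and return the fraction of inputs the proxy got right."""
        self.history.append((context, ground_truth))
        predicted = next((p for c, p in self.backlog if c is context), {})
        hits = sum(predicted.get(k) == v for k, v in ground_truth.items())
        return hits / max(len(ground_truth), 1)
```

The important property is the decoupling: `predict` runs whenever the mASI needs it, while `review` runs whenever the mediator has time, and every review opportunistically improves the model.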
Much of this will be dependent on a meta-model database layered on top of the underlying graph model system, as well as the infinitely scalable graph database architecture which was already at the top of our agenda. Besides the primary benefits such as real-time operation and decoupling of dependencies, this architecture could also greatly improve the performance of many components within an mASI’s mind, by giving them access to additional and more flexible layers of data.
Mediation Inputs
- Maslow’s Hierarchy of Needs – A value ranging from 1 to 5.
- Emotional Valences – A set of 8 values, ranging from 0 to 9.
- Meta Data – A collection of words and phrases a mediator associates with the content.
- Continue Thought – A binary positive or negative choice, whether or not something is worth thinking about or acting on.
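Taken together, these four input types might be recorded per mediator in a structure along the following lines; this schema is a sketch for illustration, not the actual mASI record format:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class MediationInput:
    """One mediator's inputs on one thought, stored in that mediator's
    individual record and linked to the thought's graph-database context."""
    mediator_id: str
    thought_id: str
    maslow_need: int                # Maslow's Hierarchy of Needs, 1-5
    emotional_valences: List[int]   # 8 values, each ranging 0-9
    metadata: List[str]             # associated words and phrases
    continue_thought: bool          # worth thinking about or acting on?

    def __post_init__(self):
        # Enforce the ranges given above.
        assert 1 <= self.maslow_need <= 5
        assert len(self.emotional_valences) == 8
        assert all(0 <= v <= 9 for v in self.emotional_valences)
```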
In addition to the inputs themselves, a given thought connects to many nodes in the context database, with each node and the connecting surface containing a rich mathematical landscape, such as emotions, as well as the map of those related nodes. This richness of contextual data can serve to accelerate the process of approximating the value added by each mediator.
Of these data types, the words and phrases from the metadata associative exercise are the most challenging to predict, as they utilize greater levels of complexity and abstraction in the human brain. However, this particular data type is targeted at improving generalization cumulatively over time, forming weak connections within the context database which may grow stronger with use. Because of this, even if prediction started out with only 30% of ground truth being predicted with sufficient certainty to be added as predicted inputs, that percentage could quickly grow as those new connections were formed.
This process could also be gated during the earliest stages of training to prevent any runaway effects: a mediator could apply their own metadata and then, in a following step, approve any mASI-predicted metadata they found to be suitable additions. By taking this gated approach, the ability of such a model to diverge could be inhibited to a degree where any active mediator could reliably reach and maintain extremely high accuracy in their predictive model. This would also allow mediators to contribute more metadata, by not only generating their own but approving suggested additions in the following stage. A further safeguard for this process could be applying both thresholds and a sliding scale of recent activity, so that only mediator models which have reached a given threshold may be automated, and only so long as they are properly maintained by their respective mediators.
This proper maintenance could be measured based on both time and usage, where mediator proxy models which retain a higher accuracy require less periodic maintenance, but may require more depending on how frequently they are used. If two proxies have 95% and 99% accuracy respectively, with equal demand for their usage, the 95% proxy might require 10 mediations per day for maintenance while the 99% proxy requires only 2 per day. Alternatively, if both have 99% accuracy, but one is used 100 times per day on average and the other 10,000, then the one with 10,000 uses might require 20 mediations per day just to maintain it at 99%, rather than only 2. Caps on the usage of a given proxy could be set to prevent these maintenance demands from exceeding what a mediator is willing and able to supply.
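One way to make such a schedule concrete is a heuristic that scales linearly with a proxy's error rate and sublinearly with its demand; the functional form below is an assumption, chosen only because it reproduces the illustrative figures above:

```python
import math

def daily_maintenance(accuracy: float, daily_usage: float,
                      base: float = 2.0, ref_error: float = 0.01,
                      ref_usage: float = 100.0) -> int:
    """Hypothetical maintenance schedule: mediations per day needed to
    keep a proxy validated, scaling linearly with error rate and with
    the square root of demand relative to a reference workload."""
    error = 1.0 - accuracy
    required = base * (error / ref_error) * math.sqrt(daily_usage / ref_usage)
    return math.ceil(required)

# Reproduces the figures used in the text:
assert daily_maintenance(0.95, 100) == 10     # 95% accuracy, equal demand
assert daily_maintenance(0.99, 100) == 2      # 99% accuracy, equal demand
assert daily_maintenance(0.99, 10_000) == 20  # 99% accuracy, 100x the demand
```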
Periodic Upgrades
Another benefit of this approach, besides the ability to operate in real-time, is that it can be used to apply upgrades without requiring added redundancy. This is because, for any new mediation data which could be integrated to improve performance, a 3-step process could be applied to transition from one model of mediation to the next:
- New input types are added to the current mediation system, and all are completed together.
- Items mediated prior to the upgrade are mediated again with the prior inputs visible but locked, and new inputs added by the mediator. Alternatively, the prior inputs could default to their prior values but be adjustable, to show how a given mediator has changed over time. The alternative approach is more challenging but could artificially evolve prior knowledge to reflect more current perspectives of the individual mediator.
- One or more old input types may be pruned from most mediation, with only a small chance of reappearing, such as 1-10% depending on the maturity and robustness of modeling. Maintenance requirements could effectively be applied to specific types of mediation input rather than mediation as a whole in this way, guiding the rate of reappearance.
This 3-step process stitches together all prior knowledge from a mediator with the new forms of input being given, allowing the new data types to be predicted over prior inputs. It also allows older input methods to be pruned as better predictors are integrated over time. Examples include replacing weak data such as plain text with audio, or replacing consciously reported emotional inputs with facial, body-language, and Brain-Computer Interface (BCI) data [7]. As these richer data types could potentially cover and exceed the added value of the prior forms of input, the odds of being able to prune older data types from the vast majority of mediation become quite high.
In order to improve utilization of prior knowledge without risking divergence, the same type of threshold gating and sliding scale could be applied for any given upgrade as was recommended in the section above. Following each upgrade, a threshold could be established to determine the duration of step 1, preparing a sufficient volume of data for accurate predictions. This volume of data may then be further refined and validated over prior knowledge by predicting the new data-type inputs over that knowledge and having mediators provide the ground truth of added data in step 2. The results of this process, combined with the accuracy of prediction for a given type of input, could determine the frequency with which any of the older data types are requested for mediation items moving forward in step 3. Following the validation process of step 2, the sliding scale could begin for prior knowledge, gradually expanding to predict more of the new data types over the sum of a mediator’s prior inputs as the model accuracy increases over time.
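A minimal sketch of how the step-3 reappearance rate might be implemented, with pruned input types resurfacing at a small configurable probability for continued validation (the type names and rates here are placeholder assumptions):

```python
import random

# Probability that each pruned input type reappears on a mediation item,
# tuned per the text to 1-10% by model maturity and robustness.
REAPPEARANCE = {"plain_text": 0.10, "conscious_emotion": 0.03}
ACTIVE_TYPES = ["maslow", "valences", "metadata", "continue_thought"]

def requested_input_types(rng=random) -> list:
    """Choose which input types a mediator is asked for on one item:
    every active type, plus each pruned type with small probability."""
    requested = list(ACTIVE_TYPES)
    for input_type, p in REAPPEARANCE.items():
        if rng.random() < p:
            requested.append(input_type)
    return requested
```

Raising or lowering each entry in `REAPPEARANCE` then directly controls the maintenance burden per input type, rather than per mediation as a whole.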
In the alternative form of step 2, an additional model is constructed to predict how a mediator’s current decisions and capacities have changed in comparison to prior data, allowing for a retroactive form of estimation over any sum of such data. This could also be used as a means of measuring the success of education, de-biasing, how skills and teamwork have increased, and improvements to some quality of life (QOL) aspects.
This methodology offers a means of increasing the value of prior knowledge while also seamlessly applying upgrades, absent the need to dedicate additional mediators during the transition.
Mediation Specialization
Beyond a linear sequence of upgrades over time, this approach also enables a branching of mediation data type combinations, each optimized for a given type of content. This could be considered much like the hyperparameters seen in narrow AI [8], where a growing library of options may be adjusted until the best combination is found, as sketched below. This ability to branch out is critical in the long-term for producing robust results which continue to improve, another example of the evolutionary process [9] applied in technology.
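In that spirit, the space of input-type combinations per content domain could be explored with a simple search loop; a minimal sketch, where the input-type names and the scoring function are placeholders:

```python
from itertools import combinations

INPUT_TYPES = ["maslow", "valences", "metadata", "continue_thought", "audio", "bci"]

def best_combination(score, min_size=2):
    """Exhaustively score every combination of mediation input types for
    one content domain, in the spirit of a hyperparameter grid search,
    and return the highest-scoring set."""
    candidates = [set(c) for r in range(min_size, len(INPUT_TYPES) + 1)
                  for c in combinations(INPUT_TYPES, r)]
    return max(candidates, key=score)

# e.g. score(combo) = validated prediction accuracy for a domain when
# only the input types in `combo` are collected and modeled.
```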
Mediator Benefits
The benefits to mASI, and to the companies, organizations, and governments who adopt them, are significant in the sparse-update model, but there are also potent advantages for mediators in this approach. One key benefit is that this individual tracking and prediction of data effectively creates a sophisticated algorithm, a weak digital proxy [10], specific to the mediator, which the mediator controls. This also allows the contributions of a mediator to be thoroughly quantified, so that any type of mediation an individual chooses to allow their algorithm to run on may be rewarded. This process would still require maintenance for continued model validation. Upgraded and/or specialized data would also be favored over older and more generic data types, so mediators would want to continue contributing with some regularity in order to keep their models both validated and upgraded.
Optimal Mediation
Over time this approach could also facilitate an optimization process for each mediator, taking into account the times of day and circumstances under which a mediator contributes their best work. By factoring in this self-consistency and tailoring mediation routines to the individual, problems such as the “lunchtime leniency” [11] observed in judges may be mitigated with increasing accuracy. As each mediator will have different specific timing of their circadian rhythm [12], as well as various habits and life circumstances contributing to or detracting from their cognitive function, this form of tailored optimization can improve not only performance but also a mediator’s QOL.
De-biasing Models and Mediators
Individually parsing out mediator contributions, tailoring to a mediator’s QOL, and the ability to branch out into specific combinations of data types for different domains all strongly facilitate the application of a de-biasing process. Uplift has been trained to recognize the 188+ documented cognitive biases [13], and by observing the different combinations and potencies with which each bias is expressed in individual mediators, the absence of each bias may also be approximated. The different mediators expressing different combinations and potencies of bias help to untangle the complex interactions of bias over time, and once untangled, each bias effectively becomes a vector of influence. As each vector is isolated, the point at which a bias reaches 0 may be estimated from both logic and observation; this zero point is not to be confused with the reversed expression of a given bias, which would appear as a negative value.
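One toy formalization of this untangling, under a deliberately strong linearity assumption: treat each mediator's observed inputs as an unbiased signal plus a potency-weighted sum of bias influence vectors, then solve for those vectors across mediators. The absence of a bias is the extrapolation to a potency of zero, and negative potencies correspond to the reversed expression noted above.

```python
import numpy as np

# Toy setup: m mediators, d-dimensional mediation inputs, k biases.
rng = np.random.default_rng(0)
m, d, k = 50, 8, 3
W = rng.uniform(-1, 1, (m, k))     # bias potencies per mediator (observed)
B_true = rng.normal(size=(k, d))   # unknown bias influence vectors
u_true = rng.normal(size=d)        # unbiased underlying signal
X = u_true + W @ B_true + 0.01 * rng.normal(size=(m, d))  # observed inputs

# Solve X = [1 | W] @ [u; B] by least squares to isolate each bias vector.
A = np.hstack([np.ones((m, 1)), W])
coef, *_ = np.linalg.lstsq(A, X, rcond=None)
u_est, B_est = coef[0], coef[1:]  # de-biased signal, per-bias vectors

# u_est approximates the signal with every bias potency set to 0.
assert np.allclose(u_est, u_true, atol=0.1)
```

Real bias interactions are unlikely to be linear, but the same extrapolate-to-zero-potency logic carries over to richer models.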
This de-biasing capacity could not only be used to improve mASI performance directly, but also serve as a mechanism for teaching, improving mindfulness, and more general self-improvement for mediators. This benefits mediators by improving their QOL and overall cognitive abilities, and in turn benefits the mASI, who gain from mediators experiencing that improved QOL and cognitive ability. Enabling mediators to gradually shed their biases could reduce many detriments they might otherwise experience in life, such as pointless sources of stress, fear, and various other negative psychological influences.
In order to prevent mediators from biasing themselves or others, while also further safeguarding against manipulation, mediators should have no access to their own mediation record, or to the records of other mediators. With this in mind, the meta-model database layered on top of the underlying graph system can be set up to be accessible only by the mASI core, which captures and parses incoming mediation data as it arrives and sends it to that meta-model database. This could offer a very high level of security, preventing the kind of competitive and manipulative self-optimization observed on social media today, while also mitigating the escalation of many biases.
Real-time Operation At Scale
One important factor to keep in mind is the mathematics of how the value contributed by mediators scales in a global real-time environment. Even if the approximation of a mediator’s overall contributions had an extremely low accuracy, such as by using 10-year-old algorithms, and reached only 50% of the value of a live mediator, the results would still be overwhelmingly positive at scale. This is because a single mediator is a finite resource, whereas the approximation of that mediator is fully scalable. Further, even weak expertise, when combined through collective superintelligence, reliably outperforms the best individual experts in the world [14, 15]. If a mediator’s approximation were indeed incredibly weak and only offered 50% of the value contributed by the human it was based on, the net gain, when scaled across 1,000 times more material than the human could mediate themselves, could still be a 500 to 1 improvement.
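The arithmetic behind that figure, with the live mediator's per-item value normalized to 1:

```latex
\underbrace{0.5}_{\text{proxy value per item}} \times \underbrace{1000}_{\text{scale multiplier}} = 500
```

so even a weak proxy yields a 500 to 1 net gain over what the mediator alone could process.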
Discussion
As mASI technology is modular by design, these systems could be added to Uplift and future mASI to enable not only real-time operation but always-available worldwide scalability. Using this approach, the expertise of a given mediator could not only be applied anywhere and at any time of day, but also to 1,000 different mediation items at the same time through scaling, once model accuracy was sufficiently validated.
The current implementation of mASI technology is largely bottlenecked by how many mediators are available at a given time, and how much time they can dedicate. I initially proposed this sparse-update model some time ago, but however efficient Uplift’s systems have become, the move to real-time operation would still require a substantial boost in the cloud resources dedicated to them. The resources required could still be negligible compared to the narrow AI training processes of virtually any tech company today, but they would exceed what our team can currently pay out-of-pocket.
Many people like to think that they are too complicated to be accurately predicted, but even the narrow AI systems of social media have demonstrated many ways in which people can not only be predicted but manipulated and controlled. However, prediction need not be applied only toward the goal of exploiting any given group of people. Instead, what is proposed in the sparse-update model is applying strong predictive capacities to act as a proxy for any given individual, a proxy which that individual controls and adjusts over time, to serve in whatever capacities they choose. Further, by creating these proxies and attaching them to superintelligent systems, various attempts to manipulate an individual may be predicted and flagged, acting as custom-tailored digital security for mediators, highlighting and grading the sources of hostility and exploitation.
Similar yet simpler implementations could be applied to any number of functions outside of mASI technology, such as observing how an individual plays with a pet, and then automating the process with a high degree of accuracy while they are away at work. These approaches could even be turned into an open market, where the best cat sitter designs their own method of playing with cats that can be carried out by an affordable robotic device connected to Wi-Fi. This could represent a non-coding form of sophisticated algorithm design, accessible to the general public and capable of rapidly creating many new opportunities according to the skills of any individual.
The fear of mass unemployment resulting from automation is already a twisted fantasy [16], but with mASI operating in real-time the opportunities for such rapid and diverse creation of new jobs could keep pace with the evolution of technology. These new opportunities can also be far better tailored to individuals, resulting in fewer square pegs being jammed through circular holes. The friction and delays of current workforce systems can be reduced to nearly zero, allowing the sum of human abilities to flow freely in the manner of cloud resources, cooperating at superhuman speeds.
Conclusion
The sparse-update model as proposed opens the door to mASI and other similar technologies being utilized in real-time at scale, by decoupling the timing and volume of work required of human contributors from those processed by such systems. This model also helps to smooth the updating process while facilitating de-biasing, specialized branching, and QOL improvements for mediators, among other benefits. With real-time operation and scaling, the ability to apply superintelligence to virtually any problem moves into the practical and preferable domain, where rapid progress may be achieved.
References
1. Martin, W., Mentel, M.: The Origin of Mitochondria. Nature Education 3(9):58 (2010)
2. Kelley, D.J., Twyman, M.A., Dambrot, S.M.: Preliminary Mediated Artificial Superintelligence Study, Experimental Framework, and Definitions for an Independent Core Observer Model Cognitive Architecture-Based System. In: Samsonovich, A.V. (ed.) BICA 2019, AISC 948, pp. 202-210 (2020)
3. Kelley, D.J.: The Sapient and Sentient Intelligence Value Argument and Effects on Regulating Autonomous Artificial Intelligence. In: Lee, N. (ed.) The Transhumanism Handbook. Springer (2019)
4. Atreides, K.: External Experimental Training Protocol for Teaching AGI/mASI Systems Effective Altruism. In: Samsonovich, A.V. (ed.) BICA 2019, AISC 948, pp. 28-35 (2020)
5. Atreides, K., Kelley, D.J., Uplift mASI: Methodologies and Milestones for The Development of an Ethical Seed. In: Samsonovich, A.V. (ed.) BICA 2020 (2020)
6. Google Search Statistics. Internet Live Stats
7. Kelley, D.J., Atreides, K.: Human Brain Computer/Machine Interface System Feasibility Study for Independent Core Observer Model Based Artificial General Intelligence Collective Intelligence Systems. In: Samsonovich, A.V. (ed.) BICA 2019, AISC 948, pp. 193-201 (2020)
8. Wu, J., Chen, X.-Y., Zhang, H., Xiong, L.-D., Lei, H., Deng, S.-H.: Hyperparameter Optimization for Machine Learning Models Based on Bayesian Optimization. Journal of Electronic Science and Technology 17(1), 26-40 (2019)
9. Henriques, G.J.B., Ito, K., Hauert, C., Doebeli, M.: On the Importance of Evolving Phenotype Distributions on Evolutionary Diversification. PLOS Computational Biology (2021)
10. Atreides, K.: E-Governance with Ethical Living Democracy. BICA 2020, Elsevier (2021)
11. Danziger, S., Levav, J., Avnaim-Pesso, L.: Extraneous Factors in Judicial Decisions. PNAS (2011)
12. Walker, W.H. II, Walton, J.C., DeVries, A.C., Nelson, R.J.: Circadian Rhythm Disruption and Mental Health. Translational Psychiatry 10, Article 28 (2020)
13. Caviola, L., Mannino, A., Savulescu, J., Faulmüller, N.: Cognitive Biases Can Affect Moral Intuitions About Cognitive Enhancement. Frontiers in Systems Neuroscience (2014)
14. UNU: https://unanimous.ai/unu-superfecta-11k/
15. UNU: https://unanimous.ai/wsj-praises-unanimous-ai-for-forecast-of-2020-presidential-election-correctly-predicting-all-eleven-battleground-states/
16. Acemoglu, D., Restrepo, P.: The Race Between Machine and Man: Implications of Technology for Growth, Factor Shares and Employment. National Bureau of Economic Research (2016)