Solution

CoCoS.ai is a distributed, microservice-based cloud solution that enables confidential and privacy-preserving AI/ML, i.e. model training and inference on confidential data sets. Privacy preservation is considered a “holy grail” of AI. It opens many possibilities, among them collaborative, trustworthy AI. The final product enables data scientists to train AI and ML models on confidential data that is never revealed, and can be used for Secure Multi-Party Computation (SMPC). Running AI/ML on combined data sets from different sources can unlock significant value.

CoCoS.ai provides the following features:

Data Scientist pipelines with UI

User and key management

Distributed computation orchestration over TEE-enabled machines

Programmable TEE environments (novel protocols)

Result brokering while adhering to the most recent IETF Open Trust Protocol standards

API for programmatic platform manipulation

Secure Multi-Party Computation (SMPC) allows two or more parties to jointly perform a computation and receive its output without exposing any party’s sensitive input. With the growing use of machine learning, those parties can be, for example, the owners of sensitive data on one side and the providers of machine learning models and code on the other: the sensitive data is not exposed to the data processors and vice versa, while the useful result of the data processing is still obtained. The traditional enabler of SMPC is cryptography. One branch of research over the last decade has pursued cryptographic algorithms for SMPC (e.g. homomorphic encryption), but the large number of cryptographic operations required makes these techniques not entirely practical for most real-time online computations.

The recent emergence of Trusted Execution Environments (TEEs), which provide hardware-enforced isolation of in-use code and data, allows for more tractable SMPC. The two most prominent technologies are Intel SGX (Software Guard Extensions) and AMD SEV (Secure Encrypted Virtualization). Both isolate code and data by encrypting them in real time: SGX protects trusted parts of applications (enclaves), while SEV protects whole virtual machines. The cryptographic keys used for this isolation are randomly generated and stored on the processor and are exposed neither to the hypervisor nor to the operating system, so data processing can be organized in such a way that the server owner or cloud provider cannot see the user’s data or code.
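To make the SMPC property concrete, the following minimal sketch computes a joint sum with additive secret sharing. This is a classic cryptographic technique chosen purely for illustration; it is neither the TEE-based approach described here nor the homomorphic encryption mentioned above. The modulus and all function names are assumptions, and the whole protocol is simulated in a single process.

```python
import secrets

# Public modulus agreed on by all parties (illustrative choice).
MODULUS = 2**61 - 1


def split_into_shares(secret: int, n_parties: int) -> list[int]:
    """Split `secret` into n additive shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (secret - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(private_inputs: list[int]) -> int:
    """Simulate n parties computing the sum of their inputs.

    No party ever sees another party's input, only uniformly random shares.
    """
    n = len(private_inputs)
    # Party i splits its input; share j is sent to party j.
    all_shares = [split_into_shares(x, n) for x in private_inputs]
    # Party j adds up the shares it received, one from every party.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    # Only the partial sums are published; together they reveal just the total.
    return sum(partial_sums) % MODULUS
```

Even this simple protocol needs a round of pairwise messages for a single addition, which hints at why purely cryptographic SMPC becomes costly for full model training.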

Leveraging TEE technologies, it is possible to create a system that enables SMPC. The system has to consist of multiple TEEs: at least one per party (e.g. a TEE for the code and a TEE for the data) and one for the central Security Policy Engine (SPE), which distributes the cryptographic keys to the other TEEs and enables their secure communication (Figure 2). Data in transit between the parties and the TEEs has to be encrypted at all times, while encryption inside the TEE is assumed. All involved parties have to be able to verify that their sensitive information is uploaded to the appropriate TEEs by using the attestation process provided by these technologies.
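The interplay of attestation and key distribution can be sketched as follows. This is a simulation under stated assumptions, not the CoCoS.ai implementation: the classes SimulatedTEE and SecurityPolicyEngine, the measure helper, and the report layout are hypothetical, and a plain SHA-256 hash stands in for the hardware-signed measurement that SGX or SEV would produce.

```python
import hashlib
import hmac
import secrets


def measure(code: bytes) -> str:
    """Stand-in for a TEE measurement: a hash of the loaded code."""
    return hashlib.sha256(code).hexdigest()


class SimulatedTEE:
    """Hypothetical TEE holding one party's code or data."""

    def __init__(self, name: str, code: bytes) -> None:
        self.name = name
        self.code = code
        self.channel_key = None  # set by the SPE after successful attestation

    def attestation_report(self, nonce: bytes) -> dict:
        # Real hardware signs this report with a processor-held key;
        # the simulation returns it unsigned.
        return {"name": self.name, "measurement": measure(self.code), "nonce": nonce}


class SecurityPolicyEngine:
    """Hypothetical SPE: releases the channel key only to attested TEEs."""

    def __init__(self, expected_measurements: dict[str, str]) -> None:
        self.expected = expected_measurements
        self.channel_key = secrets.token_bytes(32)

    def provision(self, tee: SimulatedTEE) -> bool:
        nonce = secrets.token_bytes(16)  # fresh challenge against replayed reports
        report = tee.attestation_report(nonce)
        expected = self.expected.get(report["name"])
        fresh = hmac.compare_digest(report["nonce"], nonce)
        trusted = expected is not None and hmac.compare_digest(
            report["measurement"], expected
        )
        if fresh and trusted:
            tee.channel_key = self.channel_key
            return True
        return False
```

In this sketch a TEE whose code differs from the expected measurement never receives the key, so it cannot join the encrypted communication between the other TEEs.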

The central component of the system is the SPE, which manages the other TEEs, provides them with the cryptographic material that allows their secure communication, and ultimately enables the SMPC. It defines the data and code distribution policy in an assumed semi-honest environment (parties are interested in faithful execution of the SMPC protocol to ensure proper operation, but may otherwise act arbitrarily to reveal the secret input of cooperating parties). The design and operation of the SPE are the same for both TEE technologies; what differs, depending on the technology, is the communication between the TEEs. To avoid the pitfall of shifting the trust in this SMPC model onto the SMPC service provider, all software components of the system will be fully auditable at any moment, and the system will have monitoring capabilities that allow each party to verify that the other parties are not attempting to access assets they are not allowed to access.
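One way to picture the distribution policy and the monitoring capability together is a policy table whose every access decision is recorded in a log that all parties can inspect. The sketch below is an assumption for illustration only; the AccessPolicy class, the party names, and the asset names are hypothetical and do not come from the CoCoS.ai design.

```python
from dataclasses import dataclass, field


@dataclass
class AccessPolicy:
    """Hypothetical SPE policy: which party may access which asset."""

    # Maps a party name to the set of assets it is allowed to access.
    allowed: dict[str, set[str]]
    # Every decision is recorded as (party, asset, granted).
    audit_log: list[tuple[str, str, bool]] = field(default_factory=list)

    def request(self, party: str, asset: str) -> bool:
        """Decide an access request and record the decision."""
        granted = asset in self.allowed.get(party, set())
        self.audit_log.append((party, asset, granted))
        return granted

    def denied_attempts(self) -> list[tuple[str, str]]:
        """Return the (party, asset) pairs of all refused requests."""
        return [(party, asset) for party, asset, ok in self.audit_log if not ok]
```

Because refused requests are logged rather than silently dropped, a semi-honest party that probes for another party's input leaves a trace that the other parties can check.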