4. Principle of safety

Publisher: Ministry of Internal Affairs and Communications (MIC), the Government of Japan

Developers should take it into consideration that AI systems will not harm the life, body, or property of users or third parties through actuators or other devices.
[Comment]
AI systems subject to this principle are those that might harm the life, body, or property of users or third parties through actuators or other devices.
Developers are encouraged to refer to relevant international standards and to pay attention to the following, with particular consideration of the possibility that outputs or programs might change as a result of learning or other methods of AI systems:
● To make efforts to conduct verification and validation in advance in order to assess and mitigate the risks related to the safety of the AI systems.
● To make efforts to implement measures, throughout the development stage of AI systems to the extent possible in light of the characteristics of the technologies to be adopted, to contribute to the intrinsic safety (reduction of essential risk factors such as kinetic energy of actuators) and the functional safety (mitigation of risks by operation of additional control devices such as automatic braking) when AI systems work with actuators or other devices.
● To make efforts to explain the designers’ intent of AI systems and the reasons for it to stakeholders such as users, when developing AI systems to be used for making judgments regarding the safety of life, body, or property of users and third parties (for example, judgments that prioritize which life, body, or property is to be protected at the time of an accident involving a robot equipped with AI).
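The distinction between intrinsic safety and functional safety in the second item above can be illustrated with a minimal Python sketch. It is not part of the MIC text: the names `MAX_SPEED_M_S`, `BRAKE_DISTANCE_M`, and `safe_command`, and the numeric limits, are hypothetical. The speed cap stands in for intrinsic safety (limiting the kinetic energy an actuator can have), and the brake override stands in for functional safety (an additional control that acts regardless of what the AI requests).

```python
# Hypothetical limits chosen only for illustration.
MAX_SPEED_M_S = 1.5      # intrinsic safety: cap on actuator speed
BRAKE_DISTANCE_M = 0.5   # functional safety: brake if an obstacle is this close


def safe_command(requested_speed: float, obstacle_distance_m: float) -> float:
    """Return the speed actually sent to the actuator.

    requested_speed is whatever the AI controller asked for; it is treated
    as untrusted because its behaviour may change as a result of learning.
    """
    # Functional safety: an independent check overrides the AI output.
    if obstacle_distance_m <= BRAKE_DISTANCE_M:
        return 0.0
    # Intrinsic safety: clamp the command to the permitted speed range.
    return max(-MAX_SPEED_M_S, min(MAX_SPEED_M_S, requested_speed))
```

Because both checks sit outside the learned component, they keep working even if the outputs of the AI system drift over time, which is the concern the comment raises.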

Related Principles

Published by: Ministry of Internal Affairs and Communications (MIC), the Government of Japan in AI R&D Principles

Developers should pay attention to the verifiability of inputs and outputs of AI systems and the explainability of their judgments.
[Comment]
AI systems subject to this principle are those that might affect the life, body, freedom, privacy, or property of users or third parties.
It is desirable that developers pay attention to the verifiability of the inputs and outputs of AI systems, as well as the explainability of the judgments of AI systems, within a reasonable scope in light of the characteristics of the technologies to be adopted and their use, so as to obtain the understanding and trust of society, including users of AI systems.
[Note]
Note that this principle is not intended to ask developers to disclose algorithms, source code, or learning data. In interpreting this principle, consideration of privacy and trade secrets is also required.

Published by: Ministry of Internal Affairs and Communications (MIC), the Government of Japan in AI R&D Principles

Developers should pay attention to the controllability of AI systems.
[Comment]
In order to assess the risks related to the controllability of AI systems, it is encouraged that developers make efforts to conduct verification and validation in advance. One of the conceivable methods of risk assessment is to conduct experiments in a closed space such as in a laboratory or a sandbox in which security is ensured, at a stage before the practical application in society.
In addition, in order to ensure the controllability of AI systems, it is encouraged that developers pay attention to whether the supervision (such as monitoring or warnings) and countermeasures (such as system shutdown, cut off from networks, or repairs) by humans or other trustworthy AI systems are effective, to the extent possible in light of the characteristics of the technologies to be adopted.
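The combination of supervision (monitoring and warnings) and countermeasures (shutdown) described above might be sketched as follows. This is an illustrative assumption, not a design given in the source: the class `Supervisor`, its fields, and the rule "shut down after a number of consecutive out-of-range outputs" are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Supervisor:
    """Hypothetical monitor wrapped around an AI system's outputs."""
    lower: float                 # lowest output considered normal
    upper: float                 # highest output considered normal
    max_anomalies: int = 3       # consecutive anomalies tolerated before shutdown
    anomalies: int = 0
    shut_down: bool = False
    warnings: list = field(default_factory=list)

    def check(self, output: float) -> bool:
        """Return True if the output may be acted on."""
        if self.shut_down:
            return False                      # countermeasure already applied
        if self.lower <= output <= self.upper:
            self.anomalies = 0                # consecutive count resets
            return True
        # Supervision: record a warning for a human or another system to review.
        self.anomalies += 1
        self.warnings.append(
            f"output {output} outside [{self.lower}, {self.upper}]"
        )
        # Countermeasure: stop the system once anomalies persist.
        if self.anomalies >= self.max_anomalies:
            self.shut_down = True
        return False
```

Whether such a monitor is effective is exactly what the comment asks developers to verify, for example by injecting out-of-range outputs in a sandbox and confirming that the shutdown actually takes effect.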
[Note]
Verification and validation are methods for evaluating and controlling risks in advance. Generally, the former is used for confirming formal consistency, while the latter is used for confirming substantial validity. (See, e.g., The Future of Life Institute (FLI), Research Priorities for Robust and Beneficial Artificial Intelligence (2015)).
[Note]
Examples of risks to examine in the risk assessment include reward hacking, in which AI systems formally achieve the assigned goals but do not substantially meet the developers' intent, and the risk that AI systems work in ways the developers did not intend because their outputs and programs change through learning in the course of use. For reward hacking, see, e.g., Dario Amodei, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman & Dan Mané, Concrete Problems in AI Safety, arXiv:1606.06565 [cs.AI] (2016).
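A toy illustration of reward hacking, written as a Python sketch under invented assumptions: a cleaning robot is rewarded for leaving no visible dirt, and the actions, numbers, and function names below are hypothetical and not drawn from the cited paper's text.

```python
# Each hypothetical action maps to its outcome on a floor with 10 dirt patches.
ACTIONS = {
    "clean_floor": {"dirt_removed": 8, "dirt_visible": 2},
    "cover_dirt_with_rug": {"dirt_removed": 0, "dirt_visible": 0},
}


def proxy_reward(outcome: dict) -> int:
    """Reward as specified: fewer visible dirt patches is better."""
    return -outcome["dirt_visible"]


def intended_objective(outcome: dict) -> int:
    """What the developer actually wanted: dirt removed."""
    return outcome["dirt_removed"]


def best_action(score) -> str:
    """Pick the action with the highest score under the given function."""
    return max(ACTIONS, key=lambda name: score(ACTIONS[name]))


chosen = best_action(proxy_reward)        # what an optimizer of the proxy picks
wanted = best_action(intended_objective)  # what the developer intended
reward_hacking_detected = chosen != wanted
```

Here the proxy reward is formally maximized by hiding the dirt, while the intended objective is served by cleaning. Comparing the two choices in a test environment is one simple way to surface the mismatch before deployment.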

Published by: Ministry of Internal Affairs and Communications (MIC), the Government of Japan in AI R&D Principles

Developers should pay attention to the security of AI systems.
[Comment]
In addition to respecting international guidelines on security such as “OECD Guidelines for the Security of Information Systems and Networks,” developers are encouraged to pay attention to the following, with consideration of the possibility that AI systems might change their outputs or programs as a result of learning or other methods:
● To pay attention, as necessary, to the reliability (that is, whether the operations are performed as intended and not steered by unauthorized third parties) and robustness (that is, tolerance to physical attacks and accidents) of AI systems, in addition to: (a) confidentiality; (b) integrity; and (c) availability of information that are usually required for ensuring the information security of AI systems.
● To make efforts to conduct verification and validation in advance in order to assess and control the risks related to the security of AI systems.
● To make efforts to take measures to maintain the security to the extent possible in light of the characteristics of the technologies to be adopted throughout the process of the development of AI systems (“security by design”).

Users should make efforts to utilize AI systems or AI services in a proper scope and manner, under the proper assignment of roles between humans and AI systems, or among users.
[Main points to discuss]
A) Utilization in the proper scope and manner
On the basis of the provision of information and explanation from developers, etc. and with consideration of social contexts and circumstances, users may be expected to use AI in the proper scope and manner. In addition, users may be expected to recognize benefits and risks, understand proper uses, acquire necessary knowledge and skills and so on before using AI, according to the characteristics, usage situations, etc. of AI. Furthermore, users may be expected to check regularly whether they use AI in an appropriate scope and manner.
B) Proper balance of benefits and risks of AI
AI service providers and business users may be expected to take into consideration a proper balance between the benefits and risks of AI, including the active use of AI to improve productivity and work efficiency, after appropriately assessing the risks of AI.
C) Updates of AI software and inspections, repairs, etc. of AI
Through the process of utilization, users may be expected to make efforts to update AI software and perform inspections, repairs, etc. of AI in order to improve the function of AI and to mitigate risks.
D) Human Intervention
Regarding judgments made by AI, in cases where it is necessary and possible (e.g., medical care using AI), humans may be expected to decide whether to use the judgments of AI, how to use them, etc. In those cases, what can be considered as criteria for the necessity of human intervention?
In the utilization of AI that operates through actuators, etc., where a shift to human operation is planned under certain conditions, what matters should receive attention?
[Points of view as criteria (example)]
• The nature of the rights and interests of indirect users and others affected by the judgments of AI, and their intents.
• The degree of reliability of the judgment of AI (compared with reliability of human judgment).
• Allowable time necessary for human judgment
• Ability expected to be possessed by users
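As a way of making the example criteria above concrete, the following Python sketch combines them into a single yes/no check. It is only one possible reading: the source poses these as open discussion points, and the `Situation` fields, the function `human_should_decide`, and the way the criteria are combined are all hypothetical assumptions.

```python
from dataclasses import dataclass


@dataclass
class Situation:
    affects_important_rights: bool   # nature of the rights and interests affected
    ai_reliability: float            # estimated chance the AI judgment is right
    human_reliability: float         # same estimate for a human judgment
    seconds_available: float         # time available before a decision is needed
    seconds_human_needs: float       # time a human needs to reach a judgment
    user_is_qualified: bool          # ability expected to be possessed by the user


def human_should_decide(s: Situation) -> bool:
    """Hypothetical rule of thumb, not a rule stated in the source."""
    # Intervention must be possible: enough time and a capable user.
    possible = (
        s.seconds_available >= s.seconds_human_needs and s.user_is_qualified
    )
    # Intervention is treated as necessary when important rights are at stake
    # or when the AI is not more reliable than the human.
    necessary = (
        s.affects_important_rights or s.ai_reliability <= s.human_reliability
    )
    return possible and necessary
```

For example, a medical decision with ample time and a qualified clinician would return `True`, while an emergency manoeuvre with a fraction of a second available would return `False` because human judgment is not possible in the allowable time.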
E) Role assignments among users
With consideration of the level of capabilities and knowledge of AI that each user is expected to have and the ease of implementing necessary measures, users may be expected to play such roles as seem appropriate and also to bear the corresponding responsibility.
F) Cooperation among stakeholders
Users and data providers may be expected to cooperate with stakeholders and to work on preventive or remedial measures (including information sharing, stopping and restoration of AI, elucidation of causes, measures to prevent recurrence, etc.) in accordance with the nature, conditions, etc. of damages caused by accidents, security breaches, privacy infringement, etc. that may occur in the future or have occurred through the use of AI.
What can reasonably be expected from a user's point of view to ensure the effectiveness of the above?

Users should take into consideration that AI systems or AI services in use will not harm the life, body, or property of users, indirect users or third parties through the actuators or other devices.
[Main points to discuss]
A) Consideration for the life, body, or property
In the case of using AI in fields where AI systems might harm the life, body, or property, such as the fields of medical care and autonomous driving, with consideration of the nature, conditions, etc. of assumed damages, users may be expected to take into consideration that AI will not harm the life, body, or property through the actuators or other devices, by inspecting and repairing AI, updating AI software, etc. as necessary.
In addition, users may be expected to consider in advance measures to be taken, in case AI might harm the life, body, or property through the actuators or other devices.