In today's landscape, data is the fuel and AI the engine of innovation and competitive advantage, driving organizations to treat their information as a strategic asset that unlocks insights, automates processes, and accelerates innovation. Without a unified approach to governing both data and AI, however, significant risks emerge: inaccurate insights, pervasive data quality issues, model bias, and legal and ethical problems that erode trust. To innovate both swiftly and safely, a shared understanding is essential. This glossary provides a comprehensive vocabulary for the interconnected worlds of data and AI, equipping you with the terms and phrases you need to cultivate a trusted, governed, and high-performing data and AI practice.
A
-
AI governance is the application of rules, processes, and responsibilities to drive maximum value from your automated data products by ensuring applicable, streamlined, and ethical AI practices that mitigate risk, adhere to legal requirements, and protect privacy.
Learn more about AI governance -
The simulation of human intelligence in machines that are programmed to think and learn like humans. AI can take on tasks previously done by people, including problem-solving and repetitive tasks.
-
How AI models will be used to solve a specific problem. Examples include chatbots, fraud detection, and personalized product recommendations.
-
A member-based foundation that aims to harness the collective power and contributions of the global open-source community to develop AI testing tools that enable responsible AI. The Foundation promotes best practices and standards for AI.
Learn more about AI Verify Foundation -
Refers to AI systems with human-level cognitive abilities, capable of understanding, learning, and applying knowledge across various domains autonomously.
-
The obligation for individuals and organizations to take ownership of their data practices and AI decisions, ensuring they are answerable for the outcomes and impacts those systems create.
-
A defined set of step-by-step instructions or rules designed to perform a specific task or solve a specific problem using computational resources.
-
An official, independent examination and verification to ensure that AI and the data that drives it meet specific criteria set forth by governing or regulatory entities.
-
The principle of hiding the underlying technical complexity (e.g., table joins, cryptic column names) from the end-user.
-
Metadata that is continuously collected, analyzed, and used to automate intelligence and workflow actions across data systems to improve data discovery, quality, and governance.
-
An autonomous AI system that can proactively perceive its environment, make decisions, and take actions to achieve specific goals (e.g., an agent that automatically detects and remediates a data quality issue).
Learn more about Agentic AI -
The degradation of a model's predictive performance over time, often because the new, real-world data it sees no longer matches the data it was trained on.
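A common way to detect the underlying data shift that causes this degradation is to compare a feature's training distribution with its production distribution. The sketch below uses the Population Stability Index (PSI) with illustrative thresholds (0.1 and 0.25 are widely used rules of thumb, not a standard); the data, bin count, and cutoffs are assumptions for the example.

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between a training sample and a
    production sample of one numeric feature (illustrative sketch)."""
    lo, hi = min(expected), max(expected)
    edges = [lo + (hi - lo) * i / bins for i in range(bins + 1)]
    edges[0] = float("-inf")   # catch production values below the training min
    edges[-1] = float("inf")   # ...and above the training max

    def frac(sample):
        counts = [0] * bins
        for x in sample:
            for i in range(bins):
                if edges[i] <= x < edges[i + 1]:
                    counts[i] += 1
                    break
        n = len(sample)
        # small epsilon avoids log(0) for empty bins
        return [max(c / n, 1e-6) for c in counts]

    e, a = frac(expected), frac(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

train = [0.1 * i for i in range(100)]                # training distribution
prod_ok = [0.1 * i + 0.05 for i in range(100)]       # similar distribution
prod_shifted = [0.1 * i + 5.0 for i in range(100)]   # shifted distribution

print(psi(train, prod_ok))       # small PSI: little evidence of drift
print(psi(train, prod_shifted))  # large PSI: investigate for drift
```

A rising PSI on key input features is a signal to check lineage and retraining policies before model accuracy visibly degrades.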
-
Methods such as SHAP and LIME that help explain model predictions by tracing them back to specific features and training data with documented lineage.
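To make the intuition concrete, here is a minimal permutation-based attribution sketch: it scores each feature by how much shuffling that feature's column hurts accuracy. This is an illustrative stand-in for the model-agnostic idea behind such methods, not the SHAP or LIME libraries themselves; the toy model and data are assumptions.

```python
import random

def permutation_attribution(model, rows, labels, n_features, seed=0):
    """Score each feature by the accuracy drop caused by shuffling
    its column (a rough, model-agnostic attribution sketch)."""
    rng = random.Random(seed)

    def accuracy(data):
        return sum(model(r) == y for r, y in zip(data, labels)) / len(labels)

    base = accuracy(rows)
    scores = []
    for f in range(n_features):
        col = [r[f] for r in rows]
        rng.shuffle(col)  # break the feature's relationship to the label
        shuffled = [r[:f] + [v] + r[f + 1:] for r, v in zip(rows, col)]
        scores.append(base - accuracy(shuffled))
    return scores

# Toy model that only looks at feature 0, so feature 1 should score 0
rows = [[i / 9, (7 * i) % 10 / 10] for i in range(10)]
labels = [r[0] > 0.5 for r in rows]
model = lambda row: row[0] > 0.5
scores = permutation_attribution(model, rows, labels, n_features=2)
print(scores)  # feature 0 typically scores higher; feature 1 scores 0
```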
-
Policies, workflows, and controls for managing the development, validation, approval, deployment, monitoring, and retirement of machine learning models.
Learn more about AI model governance -
A comprehensive catalog of all AI and machine learning models in use across the organization, with metadata about data dependencies, ownership, and risk classification.
-
Continuous tracking of deployed model performance, accuracy, drift, fairness, and data quality to ensure ongoing reliability and regulatory compliance.
-
The gradual decline in a model’s accuracy or quality metrics over time due to data drift, requiring investigation through lineage and quality monitoring.
-
A centralized system for versioning, cataloging, storing, and governing machine learning models and their associated data lineage throughout the ML lifecycle.
-
Governance rules that define when, how, and under what data quality conditions machine learning models must be updated with new training data.
-
The controlled process of decommissioning outdated or risky models from production with proper impact analysis, communication, and governance documentation.
-
A categorization system for models based on potential business impact, regulatory exposure, data sensitivity, and ethical implications requiring different governance levels.
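As a sketch, such a categorization can be expressed as a simple scoring rule mapping impact, regulation, and data sensitivity to a tier. The factors, weights, and tier names below are illustrative assumptions, not a regulatory standard.

```python
def risk_tier(business_impact, regulated, sensitive_data):
    """Toy model-risk tiering rule (illustrative thresholds only).

    business_impact: "low" | "medium" | "high"
    regulated:       model is subject to regulatory exposure
    sensitive_data:  model consumes sensitive or personal data
    """
    score = {"low": 1, "medium": 2, "high": 3}[business_impact]
    score += 2 if regulated else 0
    score += 1 if sensitive_data else 0
    if score >= 5:
        return "tier-1"  # strictest governance: audits, sign-offs, monitoring
    if score >= 3:
        return "tier-2"  # standard governance controls
    return "tier-3"      # lightweight review

print(risk_tier("high", regulated=True, sensitive_data=True))
print(risk_tier("low", regulated=False, sensitive_data=False))
```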
-
The practice of tracking and managing different versions of machine learning models, including their metadata, training data, performance metrics, and relationships to data assets.
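A minimal in-memory sketch of what version tracking records might look like is shown below. The class names, fields, and stages are hypothetical for illustration, not the API of any specific MLOps product.

```python
from dataclasses import dataclass

@dataclass
class ModelVersion:
    version: str
    training_data: str  # lineage pointer, e.g. a dataset snapshot ID
    metrics: dict
    stage: str = "staging"

class ModelRegistry:
    """Tiny sketch of model version tracking (hypothetical API)."""
    def __init__(self):
        self._models = {}

    def register(self, name, version, training_data, metrics):
        self._models.setdefault(name, []).append(
            ModelVersion(version, training_data, metrics))

    def promote(self, name, version):
        # move the chosen version to production, archive the old one
        for mv in self._models[name]:
            if mv.version == version:
                mv.stage = "production"
            elif mv.stage == "production":
                mv.stage = "archived"

    def latest(self, name):
        return self._models[name][-1]

registry = ModelRegistry()
registry.register("churn", "1.0", "customers_2023q4", {"auc": 0.81})
registry.register("churn", "1.1", "customers_2024q1", {"auc": 0.84})
registry.promote("churn", "1.1")
```

Linking each version to its training data snapshot is what connects model versioning back to data lineage.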
-
Structured testing of AI systems by adversarial teams to identify vulnerabilities, hallucinations, bias, security weaknesses, or unsafe behavior before production deployment.
-
Potential harm arising from AI model behavior, data quality issues, bias, misuse, or regulatory non-compliance that could impact individuals, organizations, or society.
-
The use of AI and machine learning to automate IT operations, including data pipeline anomaly detection, capacity planning, and root cause analysis for data systems.
-
The obligation to justify automated decisions made by AI systems and mitigate potential harm, ensuring transparency and responsibility throughout the AI lifecycle.
-
A mechanism that enables applications or services to exchange data and functionality through standardized requests and responses, essential for data product consumption.
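The request/response contract can be sketched as follows: a consumer sends a standardized JSON request and receives a standardized JSON response with a status and payload. The endpoint, record shape, and field names are hypothetical examples.

```python
import json

# Hypothetical data product: a customer lookup exposed through a
# standardized JSON request/response contract.
CUSTOMERS = {"c-001": {"name": "Acme Corp", "tier": "gold"}}

def handle_request(raw_request: str) -> str:
    """Parse a JSON request, look up the record, return a JSON response."""
    req = json.loads(raw_request)
    record = CUSTOMERS.get(req.get("customer_id"))
    if record is None:
        resp = {"status": 404, "error": "customer not found"}
    else:
        resp = {"status": 200, "data": record}
    return json.dumps(resp)

reply = json.loads(handle_request('{"customer_id": "c-001"}'))
print(reply)
```

In practice the same contract would sit behind HTTP, but the standardized shapes on both sides are what make data products consumable by many applications.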
-
The process of identifying patterns in data that are outside the normal ranges.
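A minimal baseline for numeric data is a z-score check: flag any point more than a chosen number of standard deviations from the mean. The readings and threshold below are illustrative assumptions; production systems often use more robust or learned methods.

```python
import statistics

def zscore_anomalies(values, threshold=2.0):
    """Return values more than `threshold` standard deviations from
    the mean (the threshold is a tuning choice, not a fixed rule)."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    return [x for x in values if abs(x - mean) / stdev > threshold]

readings = [10.1, 9.8, 10.0, 10.2, 9.9, 10.1, 55.0, 10.0]
print(zscore_anomalies(readings))  # flags the outlying reading
```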
-
A chronological record of data access, modifications, policy changes and governance activities that provides evidence for compliance verification and security investigations.
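One common integrity technique for such records is hash chaining: each entry includes the previous entry's hash, so altering history breaks the chain. The sketch below is a simplified illustration with hypothetical actors and actions, not a production audit system.

```python
import hashlib
import json

class AuditTrail:
    """Append-only audit log where each entry is chained to the
    previous entry's hash, making tampering detectable (a sketch)."""
    def __init__(self):
        self.entries = []

    def record(self, actor, action, target):
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        entry = {"actor": actor, "action": action, "target": target,
                 "prev": prev_hash}
        entry["hash"] = hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()).hexdigest()
        self.entries.append(entry)

    def verify(self):
        """Recompute every hash and check the chain links up."""
        prev = "0" * 64
        for e in self.entries:
            body = {k: v for k, v in e.items() if k != "hash"}
            expected = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()).hexdigest()
            if e["prev"] != prev or e["hash"] != expected:
                return False
            prev = e["hash"]
        return True

log = AuditTrail()
log.record("alice", "UPDATE_POLICY", "pii_masking_rule")
log.record("bob", "READ", "customers_table")
print(log.verify())                  # True: chain intact
log.entries[0]["actor"] = "mallory"  # tamper with history
print(log.verify())                  # False: tampering detected
```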