From Context to Control: How MCP and Contextual Engineering Align with NIST CSF
As enterprises move from experimentation to production-grade AI systems, security frameworks are no longer optional guardrails — they are foundational architecture. In a previous post, we explored how Model Context Protocol (MCP) and contextual engineering enable reliable, scalable AI by structuring how models receive, interpret, and act on information. In this continuation, we examine how those same mechanisms map naturally to the NIST Cybersecurity Framework (CSF) and why that alignment matters for organizations deploying AI in regulated, high-risk environments.
At a high level, NIST CSF provides a shared language for managing cybersecurity risk across its core functions: Identify, Protect, Detect, Respond, and Recover (CSF 2.0 adds a sixth, Govern; this post focuses on the original five). MCP and contextual engineering do not replace this framework — they operationalize it for AI systems. Together, they create enforceable boundaries around what AI systems know, what they can do, and how their behavior can be monitored, audited, and corrected over time.
Identify: Defining AI Assets, Boundaries, and Risk
The Identify function focuses on understanding assets, dependencies, and risk exposure. In AI systems, this includes models, prompts, data sources, tools, and decision pathways — many of which are dynamic and opaque without intentional design.
MCP enables explicit declaration of context sources, tool permissions, and execution constraints. Contextual engineering formalizes why specific information is included and when it is appropriate. Together, they transform AI context from an implicit prompt blob into a defined system asset that can be inventoried, classified, and risk-assessed. This directly supports NIST requirements around asset management, governance, and risk understanding — especially critical when AI systems interact with financial, healthcare, or identity data.
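As an illustration, an MCP server advertises each tool as a typed, self-describing declaration. The `name`/`description`/`inputSchema` shape below follows the MCP `tools/list` response format; the tool itself is hypothetical. Because every capability is declared this way, it can be inventoried and risk-assessed like any other asset:

```json
{
  "name": "lookup_customer_record",
  "description": "Read-only lookup of a customer profile by ID",
  "inputSchema": {
    "type": "object",
    "properties": {
      "customer_id": { "type": "string" }
    },
    "required": ["customer_id"]
  }
}
```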
Protect: Enforcing Least Privilege at the Context Layer
The Protect function is about safeguards — and for AI systems, the most fragile attack surface is often context itself. Over-broad prompts, unrestricted tool access, and uncontrolled memory introduce silent failure modes and security risk.
Contextual engineering applies least privilege principles to AI inputs, ensuring models only receive the minimum information required for a task. MCP reinforces this by constraining tool invocation, parameter scope, and execution rights at runtime. Rather than relying on policy documents or developer discipline, protection becomes enforceable by system design. This mirrors traditional security controls like IAM and network segmentation, but applied at the AI orchestration layer.
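A minimal sketch of least-privilege context selection, assuming each context item carries a sensitivity label and each task declares the labels it may receive (the labels and helper are hypothetical, not part of MCP itself):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ContextItem:
    source: str
    label: str      # e.g. "public", "internal", "pii"
    content: str

def minimum_context(items, allowed_labels):
    """Return only the context items a task is entitled to see."""
    return [item for item in items if item.label in allowed_labels]

items = [
    ContextItem("wiki", "public", "Deployment runbook"),
    ContextItem("crm", "pii", "Customer SSN on file"),
    ContextItem("jira", "internal", "Open incident INC-123"),
]

# A triage task needs internal tickets, never raw PII.
visible = minimum_context(items, allowed_labels={"public", "internal"})
```

The point of the design is that the PII item never reaches the prompt at all, rather than relying on the model to ignore it.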
Detect: Observability Into AI Decisions and Behavior
Detection requires visibility — and AI systems are notoriously difficult to observe without structured instrumentation. MCP provides standardized hooks for logging context usage, tool calls, and decision pathways, while contextual engineering defines what signals matter.
This enables organizations to detect anomalies such as unexpected data access, abnormal tool usage, or behavioral drift. From a NIST CSF perspective, this supports continuous monitoring, event analysis, and detection processes that are essential for enterprise environments. Importantly, detection here is not limited to infrastructure-level threats; it extends to semantic and behavioral risks unique to AI systems.
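One way to act on such logs is a simple baseline check over tool-call records. The sketch below is illustrative (the baseline set, log shape, and threshold are assumptions, not a real MCP API):

```python
from collections import Counter

# Tools this agent is expected to use under normal operation.
BASELINE_TOOLS = {"search_tickets", "read_config", "summarize"}

def flag_anomalies(tool_calls, max_calls_per_tool=10):
    """Flag unexpected tools and abnormal call volumes in an agent's log."""
    counts = Counter(call["tool"] for call in tool_calls)
    flags = []
    for tool, n in counts.items():
        if tool not in BASELINE_TOOLS:
            flags.append(f"unexpected tool: {tool}")
        if n > max_calls_per_tool:
            flags.append(f"abnormal volume: {tool} called {n} times")
    return flags

log = [{"tool": "search_tickets"}] * 3 + [{"tool": "delete_user"}]
alerts = flag_anomalies(log)
```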
Respond: Containing and Correcting AI Failures
When incidents occur, response speed and clarity matter. Poorly structured AI systems make it difficult to isolate failure causes or apply targeted remediation.
By structuring AI behavior through MCP-defined contracts and context layers, organizations can respond surgically — disabling specific tools, revoking context sources, or tightening execution constraints without shutting down entire systems. Contextual engineering ensures response actions do not introduce new ambiguity or unintended consequences. This maps directly to NIST’s emphasis on coordinated response, mitigation, and communication.
Recover: Learning and Improving After AI Incidents
Recovery is not just about restoration; it’s about improvement. For AI systems, this means refining prompts, adjusting context boundaries, updating safeguards, and strengthening controls based on real-world failures.
Because MCP and contextual engineering make AI behavior explicit and inspectable, post-incident analysis becomes actionable rather than speculative. Organizations can evolve their AI systems in a controlled way — strengthening resilience, updating governance rules, and feeding lessons learned back into system design. This closes the loop envisioned by NIST CSF’s recovery function.
Why This Matters for Enterprise AI
The convergence of MCP, contextual engineering, and NIST CSF represents a shift from AI as experimentation to AI as critical infrastructure. Enterprises do not need new security frameworks for AI — they need AI systems that are compatible with the frameworks they already trust.
By treating context as a governed asset and MCP as an enforcement mechanism, organizations can deploy AI systems that are auditable, defensible, and resilient by design. This alignment is what allows AI to move safely into core business workflows — not despite security requirements, but because of them.
Why MCP and Contextual Engineering Are Both Essential for Enterprise Security AI
As enterprises increasingly adopt AI to support security operations—vulnerability management, compliance reviews, incident triage, and access audits—many teams discover the same uncomfortable truth: AI systems fail not because models are weak, but because context is poorly designed and inconsistently delivered. This is where Model Context Protocol (MCP) and contextual engineering come together. Separately, each solves a different class of problems. Together, they form the foundation for secure, reliable, and auditable AI systems in enterprise environments.
MCP provides the infrastructure layer for context. It standardizes how models access external systems such as ticketing tools, code repositories, vulnerability scanners, and identity platforms. In an enterprise security setting, this means AI agents can retrieve information from Jira, GitHub, IAM systems, or compliance repositories through well-defined, permissioned interfaces rather than raw prompt injection or brittle API wrappers. MCP ensures access is controlled, scoped, and structured—critical requirements when dealing with sensitive security data.
However, access alone does not produce trustworthy results. This is where contextual engineering plays a decisive role. Contextual engineering defines what information the model should see, when it should see it, and how it should be framed. In a security review workflow, for example, the model should not ingest every vulnerability ever recorded. Instead, it should be guided to focus on active, high-severity findings, recent code changes, and relevant compliance controls. Contextual engineering enforces relevance, reduces noise, and prevents overgeneralized or speculative outputs.
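The scoping rule described above can be sketched as a filter over raw scanner output. The field names and thresholds here are hypothetical; the point is that relevance criteria are encoded explicitly rather than left to the model:

```python
from datetime import date, timedelta

def review_context(findings, today, max_age_days=90):
    """Narrow raw findings to what a security-review model should see."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        f for f in findings
        if f["status"] == "open"
        and f["severity"] in {"critical", "high"}
        and f["last_seen"] >= cutoff
    ]

findings = [
    {"id": "V-1", "severity": "critical", "status": "open",
     "last_seen": date(2024, 5, 1)},
    {"id": "V-2", "severity": "low", "status": "open",
     "last_seen": date(2024, 5, 2)},
    {"id": "V-3", "severity": "high", "status": "closed",
     "last_seen": date(2024, 4, 30)},
]

scoped = review_context(findings, today=date(2024, 5, 10))
```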
Consider an AI-powered security assessment agent reviewing cloud infrastructure readiness against NIST CSF. MCP enables secure, read-only access to cloud configuration data, open Jira issues, recent deployment logs, and compliance documentation. Contextual engineering then constrains the model to evaluate only controls applicable to the organization’s architecture, exclude deprecated resources, and ground every recommendation in retrieved evidence. The result is not a generic security checklist, but a tailored, defensible assessment that aligns with enterprise risk priorities.
One of the most critical benefits of combining MCP with contextual engineering is hallucination prevention. In security contexts, hallucinations are not just inconvenient—they are dangerous. MCP ensures the model retrieves real, authoritative data rather than relying on training priors. Contextual engineering ensures the model is required to use that data, cite it, and reason within defined boundaries. This pairing transforms AI from an advisory guesser into a constrained decision-support system.
Together, MCP and contextual engineering also improve governance and auditability. Security teams must be able to explain why a recommendation was made, what data informed it, and who had access to that data. MCP provides traceable, versioned context sources and access logs. Contextual engineering provides structured reasoning paths, explicit assumptions, and documented decision criteria. This alignment is essential for regulated industries such as fintech, healthcare, and government.
As enterprises move toward agentic security workflows—where AI assists with triage, remediation planning, or compliance validation—the need for both MCP and contextual engineering becomes non-negotiable. MCP creates the secure rails; contextual engineering defines the rules of engagement. Without MCP, AI systems become unsafe. Without contextual engineering, they become unreliable. Together, they enable security teams to deploy AI that is not only powerful, but trustworthy by design.
Understanding HNSW in ChromaDB: The Engine Behind High-Performance Vector Search
As Retrieval-Augmented Generation (RAG) becomes a core architectural pattern for modern AI applications, the efficiency of vector search has never been more critical. Developers rely on vector databases not only to store high-dimensional embeddings but also to retrieve relevant information at low latency and high accuracy—especially when working at scale. ChromaDB, one of the most widely used open-source vector databases, achieves this performance through Hierarchical Navigable Small World (HNSW) graphs, a breakthrough data structure for approximate nearest-neighbor (ANN) search.
HNSW is an ANN algorithm that organizes vectors into a multi-layer graph structure. The upper layers form a sparse network that allows long-range “jumps” across embedding space, while the lower layers gradually increase in density and local connectivity. This hierarchical architecture enables the search algorithm to quickly traverse from coarse-grained layers to the fine-grained, densely connected bottom level—ultimately delivering near-logarithmic query complexity. Instead of scanning through millions of embeddings, the system efficiently navigates the graph to locate the nearest neighbors with high recall. This balance of speed and accuracy makes HNSW an ideal fit for latency-sensitive RAG applications, conversational agents, semantic search systems, and any workload relying on rapid vector similarity lookups.
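The layer-descent idea can be shown with a toy greedy search. This is not the real HNSW algorithm (a real index builds its layers probabilistically and keeps a candidate beam); here the two layers and neighbor lists are hard-coded, and "vectors" are 1-D values, purely to illustrate coarse-to-fine traversal:

```python
# 1-D "embedding" per node id.
points = {0: 1.0, 1: 3.0, 2: 4.5, 3: 7.0, 4: 9.0, 5: 9.5}

# Upper layer is sparse (long-range links); bottom layer is dense.
layers = [
    {0: [4], 4: [0]},                                    # layer 1 (sparse)
    {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4],
     4: [3, 5], 5: [4]},                                 # layer 0 (dense)
]

def greedy_search(query, entry=0):
    """Greedily move to the closest neighbor, descending layer by layer."""
    current = entry
    for graph in layers:
        improved = True
        while improved:
            improved = False
            for neighbor in graph[current]:
                if abs(points[neighbor] - query) < abs(points[current] - query):
                    current, improved = neighbor, True
    return current

nearest = greedy_search(query=8.6)
```

The sparse layer makes the long jump from node 0 to node 4 in one step; the dense layer then refines locally, which is what keeps query cost near-logarithmic instead of linear.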
A key aspect of HNSW’s flexibility in ChromaDB lies in the space parameter, which determines the distance function used throughout the index. ChromaDB natively supports several space types, including "cosine" (for directional similarity), "l2" (Euclidean distance), and "ip" (inner product). This choice directly influences retrieval behavior: cosine distance is ideal for normalized embeddings from large language models or sentence transformers; Euclidean distance is a natural fit for geometric embedding spaces; and inner product works well when maximizing alignment between vectors. Because HNSW operates directly within the chosen metric, ChromaDB can adapt to a wide range of embedding models without requiring custom indexing logic or post-processing steps. The result is a vector search engine that aligns closely with the mathematical properties of the underlying embeddings.
Beyond distance metrics, HNSW in ChromaDB exposes configurable parameters such as M (controlling the number of bi-directional links per node), ef_construction (defining graph search depth during index building), and ef (controlling search breadth at query time). These knobs give developers fine-tuned control over the accuracy-performance tradeoff. Higher values increase recall and precision but require more compute resources; lower values favor faster throughput. Because HNSW supports incremental insertion, new vectors can be added without rebuilding the index, making it well suited for dynamic workflows like real-time document ingestion or continuous model updates.
ChromaDB’s integration of HNSW extends beyond raw vector search. It pairs seamlessly with metadata and document-level filters, allowing developers to combine similarity search with structured constraints—such as filtering by category, timestamp, source type, or any custom attributes. In addition, the database’s flexible persistence options and client libraries make it easy to integrate HNSW-powered retrieval inside RAG pipelines, agent architectures, or operational ML workflows. Whether serving as an embedded engine within a Python application or deployed as a distributed service, ChromaDB maintains HNSW’s performance characteristics even as collections scale into millions of entries.
As organizations increasingly leverage RAG to ground LLMs in proprietary knowledge, retrieval speed and accuracy are becoming competitive differentiators. HNSW provides the computational backbone necessary to meet those demands, enabling ChromaDB to deliver fast, flexible, and high-recall vector search at scale. For engineers looking to build high-performance AI systems—from enterprise knowledge assistants to augmented analytics—understanding HNSW is key to unlocking ChromaDB’s full potential.
LangChain vs. LlamaIndex in a RAG Context
LangChain and LlamaIndex are two of the most widely used frameworks for building Retrieval-Augmented Generation (RAG) systems, but they serve different roles within the pipeline. LangChain provides a comprehensive toolkit for orchestrating the entire RAG workflow—retrieval, prompt construction, tool integrations, agents, and post-processing—while LlamaIndex focuses more deeply on data ingestion, indexing, and retrieval quality. In a typical RAG setup, LangChain functions as the “application orchestrator,” whereas LlamaIndex serves as the “data engine” responsible for building a highly optimized knowledge base.
In the data preparation and indexing stage, LlamaIndex offers advanced features for chunking, metadata extraction, document hierarchies, and hybrid or graph-based index structures. This makes it exceptionally strong when the quality of retrieved information depends on how the knowledge base is constructed. LangChain also supports document loading and embedding, but LlamaIndex is built specifically to give developers fine-grained control over how data is transformed into vector indexes. These indexing-centric capabilities make LlamaIndex especially effective for improving RAG retrieval relevance and precision.
When orchestrating the live retrieval and generation process, LangChain provides greater flexibility and modularity. It excels at building multi-step chains, coordinating multiple retrievers, calling external tools or APIs, routing queries, and composing different prompts. This makes LangChain a strong choice for complex RAG applications that require logic flows, evaluation loops, or agent-style reasoning. While LlamaIndex also supports retrieval pipelines and query engines, its primary focus is ensuring that the data is structured and accessible rather than orchestrating multi-step decision workflows.
Both frameworks integrate seamlessly with modern vector databases, including ChromaDB. ChromaDB is a popular choice for storing embeddings due to its open-source nature, high performance, and flexible metadata filtering. In LangChain, ChromaDB can be plugged in with just a few lines of code as a VectorStore, allowing LangChain chains and agents to retrieve relevant documents efficiently. LlamaIndex also supports ChromaDB as a storage backend, enabling developers to use LlamaIndex’s powerful indexing and query abstractions on top of the same vector database. This means teams can use ChromaDB as a shared, persistent vector layer regardless of whether the orchestration is done through LangChain, LlamaIndex, or a combination of both.
In production RAG deployments, LangChain and LlamaIndex often work side-by-side, and ChromaDB acts as a reliable vector storage layer for both. LlamaIndex can handle the data ingestion, embedding, and index construction, storing vectors inside ChromaDB. LangChain can then use that same ChromaDB instance to retrieve relevant chunks during runtime, build prompts, and drive multi-step reasoning flows. The result is a flexible, scalable, and high-quality RAG system: LlamaIndex optimizes the data and indexing layer, LangChain manages orchestration and logic, and ChromaDB provides a shared high-speed vector store that both can rely on.
LangChain Expression Language (LCEL)
LangChain Expression Language (LCEL) is a declarative way to build LLM-powered pipelines using simple, chainable components. Instead of writing complex procedural code, LCEL lets developers express a workflow—such as prompting, model invocation, parsing, and post-processing—using a clean, readable syntax. At its core, LCEL revolves around runnables, composable units that each perform a step in the pipeline. These runnables can be linked together using the pipe operator (|), making it easy to construct end-to-end flows that transform inputs into model-ready prompts, generate outputs, and parse results into structured formats.
One of LCEL’s biggest strengths is its flexibility. It allows developers to combine prompts, models, retrievers, tools, and custom Python functions into modular chains that can be reused and extended. Because LCEL is built around standard interfaces, the same chain can run in different environments—locally, in the cloud, or inside async contexts—without code changes. This consistency makes LCEL especially powerful for production RAG systems, agent workflows, and applications requiring reproducible, maintainable LLM logic.