<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[DataTweaks]]></title><description><![CDATA[A technical journal documenting experiments in AI Engineering, Large Language Models, and scalable Data Systems. Focusing on optimizing RAG pipelines, MLOps, an]]></description><link>https://datatweaks.com</link><image><url>https://cdn.hashnode.com/res/hashnode/image/upload/v1768999931232/95e81157-551d-4b22-b857-724c5133d456.png</url><title>DataTweaks</title><link>https://datatweaks.com</link></image><generator>RSS for Node</generator><lastBuildDate>Sun, 03 May 2026 01:25:59 GMT</lastBuildDate><atom:link href="https://datatweaks.com/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Taming AWS Bedrock with Asyncio: A Deep Dive into aioboto3]]></title><description><![CDATA[The Problem: When Sync Meets Stream
Picture this: You're building a real-time data pipeline. Documents are flowing in. Users are waiting for AI-generated answers. And somewhere in the middle, your code is doing... nothing.
# The villain of our story
...]]></description><link>https://datatweaks.com/taming-aws-bedrock-with-asyncio-a-deep-dive-into-aioboto3</link><guid isPermaLink="true">https://datatweaks.com/taming-aws-bedrock-with-asyncio-a-deep-dive-into-aioboto3</guid><category><![CDATA[Python]]></category><category><![CDATA[asyncio]]></category><category><![CDATA[AWS]]></category><category><![CDATA[GitHub]]></category><category><![CDATA[Open Source]]></category><dc:creator><![CDATA[Gautam Raj]]></dc:creator><pubDate>Wed, 07 Jan 2026 18:30:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1769003337992/be5ab527-fc26-4099-9212-48e6d3b79326.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-the-problem-when-sync-meets-stream"><strong>The Problem: When Sync Meets Stream</strong></h2>
<p>Picture this: You're building a real-time data pipeline. Documents are flowing in. Users are waiting for AI-generated answers. And somewhere in the middle, your code is doing... <em>nothing</em>.</p>
<pre><code class="lang-python"><span class="hljs-comment"># The villain of our story</span>
response = boto3.client(<span class="hljs-string">"bedrock-runtime"</span>).converse(...)  <span class="hljs-comment"># Blocks everything</span>
</code></pre>
<p>That innocent-looking line? It's a traffic jam. While waiting for AWS Bedrock to respond, your entire pipeline freezes. Other documents pile up. Users tap their fingers. Your monitoring dashboard turns an angry shade of red.</p>
<p><strong>The solution?</strong> Asyncio. But here's the catch: <code>boto3</code> doesn't speak async. Enter <code>aioboto3</code>.</p>
<hr />
<h2 id="heading-what-is-aioboto3"><strong>What is aioboto3?</strong></h2>
<p>Think of <code>aioboto3</code> as boto3's caffeinated cousin. Same AWS API, but it plays nice with Python's <code>async/await</code> syntax.</p>
<pre><code class="lang-python"><span class="hljs-comment"># boto3 (synchronous - blocks)</span>
client = boto3.client(<span class="hljs-string">"bedrock-runtime"</span>)
response = client.converse(...)  <span class="hljs-comment"># ⏸️ Waiting...</span>
<span class="hljs-comment"># aioboto3 (asynchronous - non-blocking)</span>
session = aioboto3.Session()
<span class="hljs-keyword">async</span> <span class="hljs-keyword">with</span> session.client(<span class="hljs-string">"bedrock-runtime"</span>) <span class="hljs-keyword">as</span> client:
    response = <span class="hljs-keyword">await</span> client.converse(...)  <span class="hljs-comment"># 🚀 Other code runs meanwhile!</span>
</code></pre>
<p>The magic is in that <code>await</code>. While Bedrock thinks about your prompt, Python can process other documents, handle other requests, or just twiddle its computational thumbs more efficiently.</p>
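<p>To make that concrete, here's a minimal sketch of what "other code runs meanwhile" looks like in practice: several Bedrock calls in flight at once on a single event loop. The helper name <code>ask_bedrock</code> and the prompts are made up for illustration; the response parsing follows the Converse API's documented shape.</p>
<pre><code class="lang-python">import asyncio

import aioboto3

session = aioboto3.Session()  # created once, reused by every call

async def ask_bedrock(prompt: str, model_id: str) -> str:
    # While this request waits on the network, the event loop runs the others.
    async with session.client("bedrock-runtime") as client:
        response = await client.converse(
            modelId=model_id,
            messages=[{"role": "user", "content": [{"text": prompt}]}],
        )
        return response["output"]["message"]["content"][0]["text"]

async def main() -> None:
    prompts = ["Summarize document A", "Summarize document B", "Summarize document C"]
    # All three requests run concurrently instead of back to back.
    answers = await asyncio.gather(
        *(ask_bedrock(p, "anthropic.claude-3-sonnet-20240229-v1:0") for p in prompts)
    )
    print(answers)

asyncio.run(main())
</code></pre>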
<h2 id="heading-the-real-world-challenge"><strong>The Real-World Challenge</strong></h2>
<p>I recently contributed native AWS Bedrock support to <a target="_blank" href="https://github.com/pathwaycom/pathway">Pathway</a>, a real-time data processing framework with 57K+ GitHub stars. The challenge?</p>
<p><strong>Build async wrappers for:</strong></p>
<ol>
<li><p><strong>BedrockChat</strong> — Conversational AI (Claude, Llama, Titan, Mistral)</p>
</li>
<li><p><strong>BedrockEmbedder</strong> — Vector embeddings for RAG pipelines</p>
</li>
</ol>
<p>And they had to:</p>
<ul>
<li><p>✅ Never block the event loop</p>
</li>
<li><p>✅ Handle retries gracefully</p>
</li>
<li><p>✅ Support multiple AWS authentication methods</p>
</li>
<li><p>✅ Play nicely with Pathway's streaming architecture</p>
</li>
</ul>
<p>Let me show you how we solved it.</p>
<hr />
<h2 id="heading-lesson-1-session-management-matters"><strong>Lesson 1: Session Management Matters</strong></h2>
<p>My first attempt was... naive:</p>
<pre><code class="lang-python"><span class="hljs-comment"># ❌ Don't do this</span>
<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_embedding</span>(<span class="hljs-params">text</span>):</span>
    session = aioboto3.Session()  <span class="hljs-comment"># New session every call!    </span>
    <span class="hljs-keyword">async</span> <span class="hljs-keyword">with</span> session.client(<span class="hljs-string">"bedrock-runtime"</span>) <span class="hljs-keyword">as</span> client:
        <span class="hljs-keyword">return</span> <span class="hljs-keyword">await</span> client.invoke_model(...)
</code></pre>
<p>Creating a new session per request is like buying a new car for every grocery trip. Expensive, wasteful, and your wallet (and AWS bill) will cry.</p>
<p><strong>The fix?</strong> Create the session once in the constructor:</p>
<pre><code class="lang-python"><span class="hljs-comment"># ✅ Do this instead</span>
<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">BedrockEmbedder</span>:</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self, region_name, **credentials</span>):</span>
        self._session = aioboto3.Session(
            aws_access_key_id=credentials.get(<span class="hljs-string">"aws_access_key_id"</span>),
            aws_secret_access_key=credentials.get(<span class="hljs-string">"aws_secret_access_key"</span>),
            region_name=region_name,
        )
    <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">embed</span>(<span class="hljs-params">self, text</span>):</span>
        <span class="hljs-keyword">async</span> <span class="hljs-keyword">with</span> self._session.client(<span class="hljs-string">"bedrock-runtime"</span>) <span class="hljs-keyword">as</span> client:
            <span class="hljs-keyword">return</span> <span class="hljs-keyword">await</span> client.invoke_model(...)
</code></pre>
<p>One session, reused forever. The code reviewer who caught this deserves a cookie. 🍪</p>
<h2 id="heading-lesson-2-the-async-with-dance"><strong>Lesson 2: The</strong> <code>async with</code> Dance</h2>
<p>Here's a pattern you'll use constantly with aioboto3:</p>
<pre><code class="lang-python"><span class="hljs-keyword">async</span> <span class="hljs-keyword">with</span> self._session.client(<span class="hljs-string">"bedrock-runtime"</span>) <span class="hljs-keyword">as</span> client:
    response = <span class="hljs-keyword">await</span> client.converse(...)
</code></pre>
<p>Why <code>async with</code>? Two reasons:</p>
<ol>
<li><p><strong>Resource cleanup</strong>: Connections are closed properly, even if errors occur</p>
</li>
<li><p><strong>Connection pooling</strong>: Under the hood, aioboto3 reuses connections efficiently</p>
</li>
</ol>
<p>Think of it as a responsible adult cleaning up after a party. The music stops, but the house doesn't stay trashed.</p>
<hr />
<h2 id="heading-lesson-3-bedrocks-format-quirks"><strong>Lesson 3: Bedrock's Format Quirks</strong></h2>
<p>AWS Bedrock doesn't speak "OpenAI". It has its own dialect:</p>
<pre><code class="lang-python"><span class="hljs-comment"># OpenAI format (what users expect)</span>
messages = [
    {<span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"You are helpful."</span>},
    {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"Hello!"</span>}]
<span class="hljs-comment"># Bedrock format (what AWS expects)</span>
messages = [
    {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: [{<span class="hljs-string">"text"</span>: <span class="hljs-string">"Hello!"</span>}]}]
<span class="hljs-comment"># Plus: system prompts go in a SEPARATE parameter!</span>
system = [{<span class="hljs-string">"text"</span>: <span class="hljs-string">"You are helpful."</span>}]
</code></pre>
<p>So I built a translator:</p>
<pre><code class="lang-python"><span class="hljs-meta">@staticmethod</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">_convert_messages_to_bedrock_format</span>(<span class="hljs-params">messages</span>):</span>
    bedrock_messages = []
    <span class="hljs-keyword">for</span> msg <span class="hljs-keyword">in</span> messages:
        role = msg.get(<span class="hljs-string">"role"</span>, <span class="hljs-string">"user"</span>)
        content = msg.get(<span class="hljs-string">"content"</span>, <span class="hljs-string">""</span>)

        <span class="hljs-comment"># Skip system messages (handled separately)</span>
        <span class="hljs-keyword">if</span> role == <span class="hljs-string">"system"</span>:
            <span class="hljs-keyword">continue</span>

        <span class="hljs-comment"># Wrap content in Bedrock's expected format</span>
        bedrock_messages.append({
            <span class="hljs-string">"role"</span>: role,
            <span class="hljs-string">"content"</span>: [{<span class="hljs-string">"text"</span>: content}]
        })

    <span class="hljs-keyword">return</span> bedrock_messages
</code></pre>
<p>Users write OpenAI-style messages. Bedrock gets what it wants. Everyone's happy.</p>
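<p>The system messages the translator skips still have to land somewhere: the Converse API takes them as a separate <code>system</code> parameter. A small companion helper could look like the sketch below; the standalone function form and the name <code>extract_system_prompts</code> are mine for illustration, not necessarily what the Pathway code calls it.</p>
<pre><code class="lang-python">def extract_system_prompts(messages):
    # Collect OpenAI-style system messages into Bedrock's separate `system` list.
    return [
        {"text": msg.get("content", "")}
        for msg in messages
        if msg.get("role") == "system"
    ]

# Both pieces then feed the same converse() call, roughly like:
#
#   kwargs = {"modelId": model_id,
#             "messages": _convert_messages_to_bedrock_format(messages)}
#   system = extract_system_prompts(messages)
#   if system:
#       kwargs["system"] = system  # system prompts go here, not in `messages`
#   response = await client.converse(**kwargs)
</code></pre>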
<hr />
<h2 id="heading-lesson-4-model-specific-request-bodies"><strong>Lesson 4: Model-Specific Request Bodies</strong></h2>
<p>Here's a fun surprise: Different Bedrock embedding models expect different JSON formats.</p>
<pre><code class="lang-python"><span class="hljs-comment"># Amazon Titan wants:</span>
{<span class="hljs-string">"inputText"</span>: <span class="hljs-string">"Hello world"</span>}
<span class="hljs-comment"># Cohere wants:</span>
{<span class="hljs-string">"texts"</span>: [<span class="hljs-string">"Hello world"</span>], <span class="hljs-string">"input_type"</span>: <span class="hljs-string">"search_document"</span>}
</code></pre>
<p>Same API endpoint. Different payloads. Classic AWS.</p>
<p>The solution? Sniff the model ID and adapt:</p>
<pre><code class="lang-python"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">embed</span>(<span class="hljs-params">self, text, model_id</span>):</span>
    <span class="hljs-keyword">if</span> <span class="hljs-string">"titan"</span> <span class="hljs-keyword">in</span> model_id.lower():
        request_body = {<span class="hljs-string">"inputText"</span>: text}
    <span class="hljs-keyword">elif</span> <span class="hljs-string">"cohere"</span> <span class="hljs-keyword">in</span> model_id.lower():
        request_body = {
            <span class="hljs-string">"texts"</span>: [text],
            <span class="hljs-string">"input_type"</span>: <span class="hljs-string">"search_document"</span>
        }
    <span class="hljs-keyword">else</span>:
        <span class="hljs-comment"># Default to Titan format</span>
        request_body = {<span class="hljs-string">"inputText"</span>: text}

    <span class="hljs-keyword">async</span> <span class="hljs-keyword">with</span> self._session.client(<span class="hljs-string">"bedrock-runtime"</span>) <span class="hljs-keyword">as</span> client:
        response = <span class="hljs-keyword">await</span> client.invoke_model(
            modelId=model_id,
            body=json.dumps(request_body),
            contentType=<span class="hljs-string">"application/json"</span>
        )
</code></pre>
<p>Is it elegant? Debatable. Does it work? Absolutely.</p>
<hr />
<h2 id="heading-lesson-5-response-parsing-gymnastics"><strong>Lesson 5: Response Parsing Gymnastics</strong></h2>
<p>Of course, if requests are different, responses are too:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Titan response:</span>
{<span class="hljs-string">"embedding"</span>: [<span class="hljs-number">0.1</span>, <span class="hljs-number">0.2</span>, <span class="hljs-number">0.3</span>, ...]}
<span class="hljs-comment"># Cohere response:</span>
{<span class="hljs-string">"embeddings"</span>: [[<span class="hljs-number">0.1</span>, <span class="hljs-number">0.2</span>, <span class="hljs-number">0.3</span>, ...]]}
</code></pre>
<p>Note the subtle plural and extra nesting. AWS keeps us on our toes:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Parse based on model</span>
<span class="hljs-keyword">if</span> <span class="hljs-string">"titan"</span> <span class="hljs-keyword">in</span> model_id.lower():
    embedding = result.get(<span class="hljs-string">"embedding"</span>, [])
<span class="hljs-keyword">elif</span> <span class="hljs-string">"cohere"</span> <span class="hljs-keyword">in</span> model_id.lower():
    embeddings = result.get(<span class="hljs-string">"embeddings"</span>, [[]])
    embedding = embeddings[<span class="hljs-number">0</span>] <span class="hljs-keyword">if</span> embeddings <span class="hljs-keyword">else</span> []
</code></pre>
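<p>For completeness, the <code>result</code> dict above comes from reading the response body, and with aioboto3 that read is itself awaitable. Here's a sketch of the pattern; the function name <code>invoke_and_parse</code> is illustrative:</p>
<pre><code class="lang-python">import json

import aioboto3

async def invoke_and_parse(session: aioboto3.Session, model_id: str, request_body: dict) -> dict:
    async with session.client("bedrock-runtime") as client:
        response = await client.invoke_model(
            modelId=model_id,
            body=json.dumps(request_body),
            contentType="application/json",
        )
        # The body is a streaming object; with aioboto3 the read is awaited too.
        raw = await response["body"].read()
    return json.loads(raw)
</code></pre>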
<hr />
<h2 id="heading-lesson-6-error-handling-in-async-land"><strong>Lesson 6: Error Handling in Async Land</strong></h2>
<p>Async code needs async error handling. Here's the pattern I used:</p>
<pre><code class="lang-python"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">chat</span>(<span class="hljs-params">self, messages, **kwargs</span>):</span>
    model_id = kwargs.get(<span class="hljs-string">"model_id"</span>)
    <span class="hljs-keyword">if</span> model_id <span class="hljs-keyword">is</span> <span class="hljs-literal">None</span>:
        <span class="hljs-keyword">raise</span> ValueError(
            <span class="hljs-string">"`model_id` is required. "</span>
            <span class="hljs-string">"Provide it in constructor or function call."</span>
        )

    <span class="hljs-keyword">try</span>:
        <span class="hljs-keyword">async</span> <span class="hljs-keyword">with</span> self._session.client(<span class="hljs-string">"bedrock-runtime"</span>) <span class="hljs-keyword">as</span> client:
            response = <span class="hljs-keyword">await</span> client.converse(
                modelId=model_id,
                messages=self._convert_messages(messages)
            )
    <span class="hljs-keyword">except</span> ClientError <span class="hljs-keyword">as</span> e:
        <span class="hljs-comment"># Log and re-raise with context</span>
        logger.error(<span class="hljs-string">f"Bedrock API error: <span class="hljs-subst">{e}</span>"</span>)
        <span class="hljs-keyword">raise</span>

    <span class="hljs-keyword">return</span> self._extract_response_text(response)
</code></pre>
<p><strong>Key principle</strong>: Fail fast with clear messages. When someone passes <code>model_id=None</code>, don't let it bubble up as a cryptic AWS error. Tell them exactly what's wrong.</p>
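<p>Retries deserve the same care. The requirements list at the top mentioned handling retries gracefully; one way to sketch that (illustrative, not the exact logic that landed in Pathway) is to retry only throttling-style errors, with exponential backoff:</p>
<pre><code class="lang-python">import asyncio

from botocore.exceptions import ClientError

RETRYABLE = {"ThrottlingException", "ServiceUnavailableException"}

async def converse_with_retries(client, max_retries: int = 3, **converse_kwargs):
    for attempt in range(max_retries + 1):
        try:
            return await client.converse(**converse_kwargs)
        except ClientError as e:
            code = e.response.get("Error", {}).get("Code", "")
            # Retry only transient, throttling-style failures; re-raise everything else.
            if code not in RETRYABLE or attempt == max_retries:
                raise
            await asyncio.sleep(2 ** attempt)  # 1s, 2s, 4s, ...
</code></pre>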
<hr />
<h2 id="heading-the-full-picture"><strong>The Full Picture</strong></h2>
<p>Here's the final architecture:</p>
<pre><code class="lang-python">┌─────────────────────────────────────────────────────────────┐
│                     BedrockChat                             │
├─────────────────────────────────────────────────────────────┤
│  __init__()                                                 │
│    └─► Create aioboto3.Session (once)                       │
│                                                             │
│  __wrapped__() [<span class="hljs-keyword">async</span>]                                      │
│    ├─► Convert messages to Bedrock format                   │
│    ├─► Extract system prompts                               │
│    ├─► Build inference config (temp, max_tokens, etc.)      │
│    ├─► <span class="hljs-keyword">async</span> <span class="hljs-keyword">with</span> session.client() <span class="hljs-keyword">as</span> client:               │
│    │       └─► <span class="hljs-keyword">await</span> client.converse()                      │
│    └─► Extract <span class="hljs-keyword">and</span> <span class="hljs-keyword">return</span> response text                     │
└─────────────────────────────────────────────────────────────┘
</code></pre>
<p>Clean. Async. Non-blocking.</p>
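<p>In code, that flow condenses to roughly the sketch below. It shows the shape, not the exact Pathway implementation; the <code>inferenceConfig</code> keys follow the Converse API, and the defaults and message handling are simplified for illustration.</p>
<pre><code class="lang-python">import aioboto3

class BedrockChat:
    def __init__(self, model_id: str, region_name: str, **credentials):
        self.model_id = model_id
        # Lesson 1: one session for the lifetime of the object.
        self._session = aioboto3.Session(region_name=region_name, **credentials)

    async def __wrapped__(self, messages: list, **kwargs) -> str:
        # Lesson 3: translate OpenAI-style messages, pull system prompts aside.
        bedrock_messages = [
            {"role": m["role"], "content": [{"text": m["content"]}]}
            for m in messages if m["role"] != "system"
        ]
        system = [{"text": m["content"]} for m in messages if m["role"] == "system"]

        request = {
            "modelId": self.model_id,
            "messages": bedrock_messages,
            "inferenceConfig": {
                "temperature": kwargs.get("temperature", 0.0),
                "maxTokens": kwargs.get("max_tokens", 512),
            },
        }
        if system:
            request["system"] = system

        # Lesson 2: async with for cleanup and connection reuse.
        async with self._session.client("bedrock-runtime") as client:
            response = await client.converse(**request)

        return response["output"]["message"]["content"][0]["text"]
</code></pre>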
<hr />
<h2 id="heading-performance-before-vs-after"><strong>Performance: Before vs After</strong></h2>
<p>The impact of going async became clear during testing:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Scenario</strong></td><td><strong>Sync (boto3)</strong></td><td><strong>Async (aioboto3)</strong></td></tr>
</thead>
<tbody>
<tr>
<td>1 request</td><td>~500ms</td><td>~500ms</td></tr>
<tr>
<td>10 concurrent</td><td>~5000ms</td><td>~600ms</td></tr>
<tr>
<td>100 concurrent</td><td>~50000ms</td><td>~2000ms</td></tr>
</tbody>
</table>
</div><p>For single requests? No difference. For streaming pipelines processing hundreds of documents? <strong>Night and day.</strong></p>
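<p>Numbers like these are easy to sanity-check yourself. A rough timing harness is sketched below; it's illustrative, not the exact benchmark behind the table, and your latencies will vary with model, region, and quotas.</p>
<pre><code class="lang-python">import asyncio
import time

import aioboto3

session = aioboto3.Session()
MODEL_ID = "anthropic.claude-3-sonnet-20240229-v1:0"

async def one_call() -> None:
    async with session.client("bedrock-runtime") as client:
        await client.converse(
            modelId=MODEL_ID,
            messages=[{"role": "user", "content": [{"text": "ping"}]}],
        )

async def main(n: int = 10) -> None:
    start = time.perf_counter()
    await asyncio.gather(*(one_call() for _ in range(n)))  # n requests in flight at once
    print(f"{n} concurrent requests took {time.perf_counter() - start:.2f}s")

asyncio.run(main())
</code></pre>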
<hr />
<h2 id="heading-key-takeaways"><strong>Key Takeaways</strong></h2>
<ol>
<li><p><strong>Session goes in</strong> <code>__init__</code> — Create once, reuse always</p>
</li>
<li><p><strong>Always use</strong> <code>async with</code> — For proper resource cleanup</p>
</li>
<li><p><strong>Model-specific handling</strong> — AWS models speak different dialects</p>
</li>
<li><p><strong>Fail fast</strong> — Validate inputs before touching the network</p>
</li>
<li><p><strong>Log everything</strong> — Your future debugging self will thank you</p>
</li>
</ol>
<hr />
<h2 id="heading-try-it-yourself"><strong>Try It Yourself</strong></h2>
<p>The code is now part of <a target="_blank" href="https://github.com/pathwaycom/pathway">Pathway</a>. Install and use:</p>
<pre><code class="lang-python"><span class="hljs-comment"># pip install pathway</span>
<span class="hljs-keyword">from</span> pathway.xpacks.llm <span class="hljs-keyword">import</span> llms, embedders

<span class="hljs-comment"># Chat with Claude</span>
chat = llms.BedrockChat(
    model_id=<span class="hljs-string">"anthropic.claude-3-sonnet-20240229-v1:0"</span>,
    region_name=<span class="hljs-string">"us-east-1"</span>)
<span class="hljs-comment"># Embed with Titan</span>
embedder = embedders.BedrockEmbedder(
    model_id=<span class="hljs-string">"amazon.titan-embed-text-v2:0"</span>,
    region_name=<span class="hljs-string">"us-east-1"</span>
)
</code></pre>
<p>Or check out the <a target="_blank" href="https://github.com/pathwaycom/pathway/pull/170">Pull Request</a> to see the full implementation and code review process.</p>
<hr />
<h2 id="heading-final-thoughts"><strong>Final Thoughts</strong></h2>
<p>Async programming in Python has a learning curve. Async programming with AWS services has a steeper one. But once you understand the patterns:</p>
<ul>
<li><p><strong>Sessions over clients</strong></p>
</li>
<li><p><code>async with</code> for resource management</p>
</li>
<li><p><strong>Format translation layers</strong></p>
</li>
</ul>
<p>...you can build high-performance, non-blocking integrations that scale beautifully.</p>
<p>The next time your pipeline needs to call AWS Bedrock a thousand times? You'll know exactly what to do.</p>
<hr />
<p><strong>Happy async coding! 🚀</strong></p>
]]></content:encoded></item></channel></rss>